What not to do?
When initializing weights in a deep learning model, here are a few things to avoid:
Don't initialize all weights to zero: All neurons in a layer then compute the same outputs and receive identical gradients (the symmetry problem), so they can never learn different features. Use random initialization instead (see the sketch after this list).
Avoid large initial weights: Excessively large values blow up activations and gradients, leading to unstable training.
Don't mix initialization schemes arbitrarily: Use the same initialization method for comparable layers so that the scale of signals stays consistent across the network.
Don't ignore the activation functions: Different activations call for different initialization schemes; for example, Xavier initialization suits tanh/sigmoid layers, while He initialization suits ReLU layers.
Avoid overly complex schemes: Don't complicate weight initialization unnecessarily; simple methods like Xavier or He initialization work well in practice.
Avoid plain random initialization with weights that are too small or too large: Tiny weights make signals and gradients vanish, while oversized ones make them explode; the scaled schemes below address both.
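The symmetry problem behind the zero-initialization warning is easy to see in code. Below is a minimal NumPy sketch (a hypothetical two-layer network with made-up sizes) showing that when all weights start at zero, every hidden unit receives exactly the same gradient, so no unit can ever learn a different feature:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))    # 4 samples, 3 input features (made-up sizes)
y = rng.normal(size=(4, 1))    # dummy regression targets
W1 = np.zeros((3, 5))          # zero-initialized hidden layer
W2 = np.zeros((5, 1))          # zero-initialized output layer

h = np.tanh(x @ W1)            # hidden activations: all zeros
out = h @ W2                   # predictions: all zeros
grad_out = out - y             # gradient of a squared-error loss w.r.t. out
grad_W1 = x.T @ ((grad_out @ W2.T) * (1.0 - h ** 2))

# Every hidden unit gets an identical gradient (here, all zeros),
# so the units can never diverge and learn different features.
print(np.allclose(grad_W1, grad_W1[:, [0]]))  # True
```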
What to do?
Xavier init (Normal):
$$\mathrm{Random}(n, n) \cdot \sqrt{\frac{1}{n}}$$
where Random(n, n) draws an n × n matrix from a standard normal distribution and n is the number of units feeding into the layer (its fan-in).
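As a minimal sketch of the formula above (assuming a square n × n weight matrix and NumPy; the helper name is illustrative, not a library API):

```python
import numpy as np

def xavier_normal(n, seed=0):
    # Draw an n x n matrix from a standard normal and scale by sqrt(1 / n).
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n, n)) * np.sqrt(1.0 / n)

W = xavier_normal(256)
print(W.std())  # roughly sqrt(1/256) = 0.0625
```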
He init (Normal):
$$\mathrm{Random}(n, n) \cdot \sqrt{\frac{2}{n}}$$
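The sketch is the same as for Xavier, with only the scaling factor changed to sqrt(2 / n), which suits ReLU activations:

```python
import numpy as np

def he_normal(n, seed=0):
    # Identical to the Xavier sketch except for the sqrt(2 / n) factor.
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n, n)) * np.sqrt(2.0 / n)
```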
Xavier init (Uniform):
$$\text{limit} = \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}$$
Weights are sampled uniformly from the range [-limit, limit].
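A minimal sketch of this rule, assuming fan_in and fan_out are the layer's input and output sizes and the weight matrix is shaped (fan_in, fan_out); the helper name is illustrative:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, seed=0):
    # limit = sqrt(6 / (fan_in + fan_out)); sample uniformly in [-limit, limit].
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = xavier_uniform(784, 256)   # e.g. a 784 -> 256 fully connected layer
```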
He init (Uniform):
$$\text{limit} = \sqrt{\frac{6}{\text{fan\_in}}}$$
Weights are sampled uniformly from the range [-limit, limit].
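And the matching sketch for the He uniform rule, under the same assumptions as above:

```python
import numpy as np

def he_uniform(fan_in, fan_out, seed=0):
    # limit = sqrt(6 / fan_in); sample uniformly in [-limit, limit].
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / fan_in)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = he_uniform(784, 256)
```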