Day 13 - Glorot & He Initialization

Weight initialization strategies

| Activation function | Name | Strategy |
| --- | --- | --- |
| Logistic, Softmax, Tanh, no activation function (in case of regression) | Glorot | A normal distribution with $\mu = 0,\ \sigma^2 = \frac{1}{fan_{avg}}$, or a uniform distribution between $-r$ and $+r$ with $r = \sqrt{\frac{3}{fan_{avg}}}$, where $fan_{avg} = \frac{fan_{in} + fan_{out}}{2}$ |
| ReLU and its variants | He | A normal distribution with $\mu = 0,\ \sigma^2 = \frac{2}{fan_{in}}$ |
| SELU | LeCun | A normal distribution with $\mu = 0,\ \sigma^2 = \frac{1}{fan_{in}}$ |
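The table above can be sketched directly in NumPy. This is a minimal illustration of the three variance formulas, not a library API; the function names are my own:

```python
import numpy as np

def glorot_normal(fan_in, fan_out, rng=None):
    # Glorot: sigma^2 = 1 / fan_avg, where fan_avg = (fan_in + fan_out) / 2
    rng = rng or np.random.default_rng()
    fan_avg = (fan_in + fan_out) / 2
    return rng.normal(0.0, np.sqrt(1.0 / fan_avg), size=(fan_in, fan_out))

def glorot_uniform(fan_in, fan_out, rng=None):
    # Glorot uniform: draw from [-r, +r] with r = sqrt(3 / fan_avg)
    rng = rng or np.random.default_rng()
    fan_avg = (fan_in + fan_out) / 2
    r = np.sqrt(3.0 / fan_avg)
    return rng.uniform(-r, r, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out, rng=None):
    # He: sigma^2 = 2 / fan_in (for ReLU and its variants)
    rng = rng or np.random.default_rng()
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def lecun_normal(fan_in, fan_out, rng=None):
    # LeCun: sigma^2 = 1 / fan_in (for SELU)
    rng = rng or np.random.default_rng()
    return rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_in, fan_out))
```

In practice you would not hand-roll these: Keras, for example, exposes them as initializer strings, e.g. `keras.layers.Dense(100, activation="relu", kernel_initializer="he_normal")`.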