Day 20 - Adaptive Learning Rates

AdaGrad
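
AdaGrad shrinks the learning rate for each parameter in proportion to the square root of its accumulated squared gradients, so dimensions with consistently steep gradients take smaller steps. A minimal NumPy sketch of one update step; the function name and the defaults `lr=0.01`, `eps=1e-8` are illustrative, and the state `s` starts at zeros and is threaded through successive calls:

```python
import numpy as np

def adagrad_update(theta, grad, s, lr=0.01, eps=1e-8):
    """One AdaGrad step (illustrative sketch).

    s accumulates element-wise squared gradients over all steps, so
    frequently or steeply updated parameters get ever-smaller steps.
    """
    s = s + grad ** 2
    theta = theta - lr * grad / (np.sqrt(s) + eps)
    return theta, s
```

Because `s` only ever grows, the effective learning rate decays toward zero, which is why AdaGrad often stops too early on deep networks (see the comparison table below).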

RMSProp
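
RMSProp fixes AdaGrad's early-stopping problem by accumulating only recent squared gradients through an exponentially decaying average instead of a running sum. A sketch under the same conventions as above, with an assumed decay rate `rho=0.9`:

```python
import numpy as np

def rmsprop_update(theta, grad, s, lr=0.001, rho=0.9, eps=1e-8):
    """One RMSProp step (illustrative sketch).

    Unlike AdaGrad, s is an exponentially decaying average, so gradients
    from the distant past fade out and training does not grind to a halt.
    """
    s = rho * s + (1 - rho) * grad ** 2
    theta = theta - lr * grad / (np.sqrt(s) + eps)
    return theta, s
```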

Adam
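
Adam (adaptive moment estimation) combines momentum's decaying mean of gradients with RMSProp's decaying mean of squared gradients, and bias-corrects both for their zero initialization. A sketch using the commonly cited defaults `beta1=0.9`, `beta2=0.999` (illustrative, not prescriptive):

```python
import numpy as np

def adam_update(theta, grad, m, v, t, lr=0.001,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step (illustrative sketch); t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad       # decaying mean of gradients (momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2  # decaying mean of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```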

AdaMax
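
AdaMax is a variant of Adam that replaces the second-moment estimate with a decayed infinity norm (a running maximum of absolute gradients), which can be more stable on some models. A sketch under the same assumptions as the Adam example; the `eps` term is an added numerical safeguard, not part of the original formulation:

```python
import numpy as np

def adamax_update(theta, grad, m, u, t, lr=0.002,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaMax step (illustrative sketch); t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad       # same first moment as Adam
    u = np.maximum(beta2 * u, np.abs(grad))  # infinity norm replaces Adam's v
    theta = theta - lr * m / ((1 - beta1 ** t) * (u + eps))
    return theta, m, u
```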

Nadam
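
Nadam is Adam plus the Nesterov trick: the update looks ahead by blending the bias-corrected momentum with the current gradient. A sketch of one common simplified formulation, again with illustrative defaults:

```python
import numpy as np

def nadam_update(theta, grad, m, v, t, lr=0.001,
                 beta1=0.9, beta2=0.999, eps=1e-8):
    """One Nadam step (illustrative sketch); t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Nesterov look-ahead: blend corrected momentum with the current gradient
    m_bar = beta1 * m_hat + (1 - beta1) * grad / (1 - beta1 ** t)
    theta = theta - lr * m_bar / (np.sqrt(v_hat) + eps)
    return theta, m, v
```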

Comparison of optimizers

| Optimizer | Convergence speed | Convergence quality |
| --- | --- | --- |
| Vanilla gradient descent | ⭐️ | ⭐️⭐️⭐️ |
| Momentum optimization | ⭐️⭐️ | ⭐️⭐️⭐️ |
| Nesterov Accelerated Gradient | ⭐️⭐️ | ⭐️⭐️⭐️ |
| AdaGrad | ⭐️⭐️⭐️ | ⭐️ (stops too early) |
| RMSProp | ⭐️⭐️⭐️ | ⭐️⭐️ / ⭐️⭐️⭐️ |
| Adam | ⭐️⭐️⭐️ | ⭐️⭐️ / ⭐️⭐️⭐️ |
| Nadam | ⭐️⭐️⭐️ | ⭐️⭐️ / ⭐️⭐️⭐️ |
| AdaMax | ⭐️⭐️⭐️ | ⭐️⭐️ / ⭐️⭐️⭐️ |
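
Each row of the table has a direct Keras counterpart. A usage sketch, assuming TensorFlow's bundled Keras; the learning rates shown are common defaults, not tuned values:

```python
from tensorflow import keras

sgd      = keras.optimizers.SGD(learning_rate=0.01)                               # vanilla GD
momentum = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)                 # momentum
nesterov = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)  # NAG
adagrad  = keras.optimizers.Adagrad(learning_rate=0.001)
rmsprop  = keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9)
adam     = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)
nadam    = keras.optimizers.Nadam(learning_rate=0.001)
adamax   = keras.optimizers.Adamax(learning_rate=0.001)
```

Any of these can be passed to `model.compile(optimizer=...)` and swapped without changing the rest of the training code.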