Adagrad

Adagrad is an optimization algorithm used in deep learning that adapts the learning rate individually for each parameter based on the history of its gradients. It accumulates the squares of past gradients and divides the learning rate by the square root of this running sum, so parameters with small or infrequent gradients receive larger updates while parameters with large or frequent gradients receive smaller ones. This makes Adagrad well suited to sparse data and reduces the need for manual learning-rate tuning. However, because the accumulated sum only grows, the effective learning rate shrinks over time, which can slow convergence.
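A minimal sketch of the update rule described above, using plain Python and hypothetical toy values (the parameter values, gradients, and learning rate are illustrative, not from any particular model):

```python
import math

def adagrad_step(params, grads, cache, lr=0.1, eps=1e-8):
    """One Adagrad step: accumulate each parameter's squared gradients,
    then divide its step by the square root of that history."""
    for i, g in enumerate(grads):
        cache[i] += g * g  # per-parameter sum of squared gradients
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)
    return params, cache

# Toy run: one parameter with a large recurring gradient, one with a
# small recurring gradient (values chosen only for illustration).
params = [1.0, 1.0]
cache = [0.0, 0.0]
for _ in range(100):
    params, cache = adagrad_step(params, [1.0, 0.01], cache)

# Despite gradients 100x apart in magnitude, the per-parameter scaling
# equalizes the effective step sizes, so both parameters move similarly.
# The ever-growing cache is also why the effective learning rate decays.
```

Note that with a constant gradient the effective step at iteration t is roughly lr / sqrt(t) regardless of the gradient's magnitude, which illustrates both the per-parameter normalization and the diminishing-learning-rate issue mentioned above.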