Improved Learning Rates for Stochastic Optimization

arXiv:2107.08686v3 Announce Type: replace-cross
Abstract: Stochastic optimization is a cornerstone of modern machine learning. This paper studies the generalization performance of two classical stochastic optimization algorithms: stochastic gradient descent (SGD) and Nesterov’s accelerated gradient (NAG). We establish new learning rates for both algorithms, with improved guarantees in some settings or comparable rates under weaker assumptions in others. We also provide numerical experiments to support the theory.

Liked Liked