Abstract

The heavy-ball (HB) method has become common practice for large-scale machine learning problems, and it achieves the fastest local convergence rate for smooth and strongly convex objective functions when using Polyak’s optimal hyper-parameters. However, such convergence rates rely on hyper-parameters that are problem-specific, uncertain, and time-invariant, which limits the method’s potential. In this paper, we propose an adaptive HB method that estimates Polyak’s optimal hyper-parameters at each iteration. Our adaptive approach employs the absolute differences between current and previous model parameters and between their gradients. This representation yields a computationally efficient optimizer. We show that our method guarantees a global linear convergence rate for smooth and strongly convex objective functions. In the stochastic setting, we show that the proposed stochastic algorithm converges almost surely for non-convex smooth functions with bounded gradients. We validate the effectiveness of our method on image classification datasets without empirical tuning, and demonstrate its superiority on quadratic and non-convex functions by comparing its performance to state-of-the-art optimizers.
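
The paper's exact update rules are not reproduced in this record, but the abstract's description admits a concrete reading: estimate the local smoothness constant L and strong-convexity constant mu from the element-wise ratios |g_k - g_{k-1}| / |x_k - x_{k-1}|, then plug those estimates into Polyak's optimal heavy-ball hyper-parameters alpha = 4/(sqrt(L)+sqrt(mu))^2 and beta = ((sqrt(L)-sqrt(mu))/(sqrt(L)+sqrt(mu)))^2. The Python sketch below illustrates that reading; the function name adaptive_heavy_ball, the plain-gradient bootstrap step, and the max/min curvature estimates are illustrative assumptions, not the authors' published algorithm.

import numpy as np

def adaptive_heavy_ball(grad, x0, n_iters=100, eps=1e-12):
    # Sketch of an adaptive Polyak heavy-ball iteration (assumed form):
    #   x_{k+1} = x_k - alpha_k * g_k + beta_k * (x_k - x_{k-1}),
    # with alpha_k, beta_k set from per-iteration curvature estimates.
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    # Bootstrap with one plain gradient step so that differences exist.
    x = x_prev - 1e-3 * g_prev
    for _ in range(n_iters):
        g = grad(x)
        dx = np.abs(x - x_prev)
        dg = np.abs(g - g_prev)
        # Element-wise secant ratios |dg|/|dx| estimate local curvature.
        ratios = dg / (dx + eps)
        L_hat = max(ratios.max(), eps)   # smoothness estimate (assumption)
        mu_hat = max(ratios.min(), eps)  # strong-convexity estimate (assumption)
        # Polyak's optimal step size and momentum for these estimates.
        alpha = 4.0 / (np.sqrt(L_hat) + np.sqrt(mu_hat)) ** 2
        beta = ((np.sqrt(L_hat) - np.sqrt(mu_hat)) /
                (np.sqrt(L_hat) + np.sqrt(mu_hat))) ** 2
        x_new = x - alpha * g + beta * (x - x_prev)
        x_prev, g_prev = x, g
        x = x_new
    return x

# Example: an ill-conditioned quadratic f(x) = 0.5 * x^T A x, gradient A @ x.
A = np.diag([1.0, 10.0, 100.0])
x_min = adaptive_heavy_ball(lambda x: A @ x, np.ones(3), n_iters=200)

For a diagonal quadratic like this example the element-wise ratios recover the Hessian's eigenvalues exactly, so alpha and beta coincide with Polyak's optimal choices without any tuning; for general objectives they only approximate the curvature extremes.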

Details

Title
An adaptive Polyak heavy-ball method
Author
Saab, Samer 1; Phoha, Shashi 1; Zhu, Minghui 1; Ray, Asok 1

1 The Pennsylvania State University, State College, USA (GRID:grid.29857.31) (ISNI:0000 0001 2097 4281)
Pages
3245-3277
Publication year
2022
Publication date
Sep 2022
Publisher
Springer Nature B.V.
ISSN
0885-6125
e-ISSN
1573-0565
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2714530613
Copyright
© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2022.