Abstract

Whether it is Google's federated learning framework for mobile devices or OpenAI's large language models capturing mainstream attention, large-scale learning systems are ubiquitous. While the rate of progress and the performance of modern learning systems have been impressive, they are still hampered by many issues. For example, training large-scale models is costly in both time and resources, making guarantees with respect to an individual sample path imperative. In addition, it has been observed that many learning models lack classical smoothness and induce phenomena such as heavy-tailed gradient noise, necessitating (stochastic) gradient methods with nonlinear mappings, such as sign, clipping, or normalization. Moreover, nonlinearly modified gradient methods are known to bring many benefits, such as stabilizing and accelerating training, reducing the size of transmitted messages, and enhancing security and privacy in distributed machine learning. Another issue stems from the fact that data are generated across varied sources, making it difficult to train a single model that caters to the needs of a wide range of users. Toward resolving these issues, the first part of this thesis establishes learning guarantees for a general framework of nonlinear stochastic gradient methods in the presence of heavy-tailed noise. The framework subsumes many popular nonlinearities, such as sign, normalization, clipping, and quantization, and provides a broad range of guarantees, including large deviation upper bounds and finite-time convergence rates, both in expectation and with high probability. The second part of the thesis studies the multi-model framework in distributed heterogeneous settings and designs algorithms that utilize the wealth of available data while providing communication-efficient models personalized to individual users.
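To make the abstract's framework concrete, the sketch below shows the generic nonlinear SGD template x_{t+1} = x_t - alpha_t * N(g_t), where N is one of the nonlinearities the abstract names (sign, clipping, normalization). This is an illustrative sketch, not the thesis's own code: the function names, the 1/(t+1) step-size schedule, and the Student-t noise example are assumptions chosen here for demonstration.

import numpy as np

# Illustrative nonlinearities N applied to the stochastic gradient g_t.
def sign_map(g):
    return np.sign(g)                                  # component-wise sign

def clip_map(g, tau=1.0):
    norm = np.linalg.norm(g)
    return g if norm <= tau else (tau / norm) * g      # clip to a ball of radius tau

def normalize_map(g, eps=1e-12):
    return g / (np.linalg.norm(g) + eps)               # normalized gradient

def nonlinear_sgd(grad_fn, x0, nonlinearity, steps=1000, alpha0=0.1):
    """Run the nonlinear SGD template with a decaying step size alpha_t = alpha0/(t+1)."""
    x = np.asarray(x0, dtype=float)
    for t in range(steps):
        g = grad_fn(x)                                 # stochastic gradient, possibly heavy-tailed
        x = x - (alpha0 / (t + 1)) * nonlinearity(g)
    return x

# Hypothetical example: least squares with heavy-tailed (Student-t, df=2,
# infinite-variance) gradient noise, where plain SGD can behave poorly.
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0])
grad = lambda x: (x - target) + rng.standard_t(df=2.0, size=2)
print(nonlinear_sgd(grad, x0=np.zeros(2), nonlinearity=clip_map))

Swapping clip_map for sign_map or normalize_map exercises the other nonlinearities the abstract's general framework is meant to cover.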

Details

Title: High-Probability and Large Deviations Techniques for Design and Analysis of Large-Scale and Distributed Learning Systems
Author: Armacki, Aleksandar
Publication year: 2025
Publisher: ProQuest Dissertations & Theses
ISBN: 9798290939971
Source type: Dissertation or Thesis
Language of publication: English
ProQuest document ID: 3238247992
Copyright: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.