Abstract

Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension d of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter d and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including ℓp-loss for all p ∈ [0, ∞]. In particular, we extend a known polynomial-time algorithm for constant d and convex loss functions to a more general class of loss functions, matching our running time lower bounds also in these cases.
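As a point of reference, the training problem discussed in the abstract can be sketched as follows; this is a hedged sketch of the standard two-layer ReLU formulation, and the exact variants (e.g., whether the output weights a_j and biases b_j are free or fixed, and the number k of hidden neurons) follow the article's own problem definitions. Given n labeled data points (x_1, y_1), ..., (x_n, y_n) ∈ R^d × R, a loss function ℓ, and k hidden ReLU neurons, one seeks parameters minimizing the total training loss:

\[
\min_{\substack{a_j \in \mathbb{R},\; b_j \in \mathbb{R},\; w_j \in \mathbb{R}^d \\ j = 1, \dots, k}}
\;\sum_{i=1}^{n} \ell\!\left( \sum_{j=1}^{k} a_j \max\!\left(0,\; w_j^{\top} x_i + b_j\right) \;-\; y_i \right),
\]

where, for the ℓp-loss mentioned above, ℓ(z) = |z|^p (with the cases p = 0 and p = ∞ interpreted in the usual limiting sense). The data dimension d appearing in this formulation is the parameter whose influence on the complexity is analyzed.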

Details

Title
The Computational Complexity of ReLU Network Training Parameterized by Data Dimensionality
Author
Froese, Vincent; Hertrich, Christoph; Niedermeier, Rolf
Pages
1775-1790
Section
Articles
Publication year
2022
Publication date
2022
Publisher
AI Access Foundation
ISSN
1076-9757
e-ISSN
1943-5037
Source type
Scholarly Journal
Language of publication
English