Subset selection for multiple linear regression via optimization

Abstract

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trades off fitting error (explanatory power) against model complexity (number of variables selected). We build mathematical programming models for regression subset selection based on mean squared error, mean absolute error, and minimal-redundancy–maximal-relevance criteria. The proposed models are tested using a linear-program-based branch-and-bound algorithm with tailored valid inequalities and big-M values, and are compared against algorithms from the literature. For high-dimensional cases, we propose an iterative heuristic algorithm based on the mathematical programming models and a core set concept, and derive a randomized version of the algorithm that guarantees convergence to the global optimum. In the computational experiments, we find that our models quickly find a high-quality solution, with the remaining time spent proving optimality; that the iterative algorithms find solutions in a relatively short time and are competitive with state-of-the-art algorithms; and that using ad hoc big-M values is not recommended.
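
For readers unfamiliar with the role of big-M values, the following is a generic textbook-style sketch of a mixed-integer formulation for subset selection under absolute error. It is not necessarily one of the models proposed in the article. The symbols are assumptions of this sketch: n observations (x_i, y_i) with m candidate variables, coefficients β, binary indicators z_j, auxiliary error variables t_i, a cardinality limit k, and a constant M.

```latex
\begin{align*}
\min_{\beta_0,\,\beta,\,t,\,z}\quad
  & \sum_{i=1}^{n} t_i \\
\text{s.t.}\quad
  & -t_i \le y_i - \beta_0 - \sum_{j=1}^{m} x_{ij}\,\beta_j \le t_i,
      && i = 1,\dots,n, \\
  & -M z_j \le \beta_j \le M z_j,
      && j = 1,\dots,m, \\
  & \sum_{j=1}^{m} z_j \le k, \\
  & z_j \in \{0,1\},
      && j = 1,\dots,m.
\end{align*}
```

Setting z_j = 0 forces β_j = 0, so z_j records whether variable j is selected. The constant M must bound |β_j| at an optimal solution: a value that is too small can cut off the optimum, while a value that is too large weakens the linear programming relaxation used in branch-and-bound. This is one reason why the choice of M matters in formulations of this kind.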

Details

Title

Subset selection for multiple linear regression via optimization

Author

Park, Young Woong¹; Klabjan, Diego²

¹ Iowa State University, Ivy College of Business, Ames, USA (GRID:grid.34421.30) (ISNI:0000 0004 1936 7312)
² Northwestern University, Department of Industrial Engineering and Management Sciences, Evanston, USA (GRID:grid.16753.36) (ISNI:0000 0001 2299 3507)

Pages

543-574

Publication year

2020

Publication date

Jul 2020

Publisher

Springer Nature B.V.

ISSN

09255001

e-ISSN

15732916

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1007/s10898-020-00876-1

ProQuest document ID

2405803439
