Abstract

Feature selection is an indispensable step in the analysis of high-dimensional molecular data. Despite its importance, there is no consensus on how to choose the most appropriate feature selection method, especially when the performance of a method itself depends on its hyper-parameters. Bayesian optimization has demonstrated advantages in automatically configuring hyper-parameters for various models. However, it remains unclear whether Bayesian optimization can benefit feature selection methods. In this research, we conducted extensive simulation studies to compare the performance of various feature selection methods, with a particular focus on the impact of Bayesian optimization on those that require hyper-parameter tuning. We further used gene expression data from the Alzheimer's Disease Neuroimaging Initiative to predict various brain imaging-related phenotypes, applying a range of feature selection methods to the data. The simulation studies showed that feature selection methods with hyper-parameters tuned by Bayesian optimization often yield better recall rates, and the analysis of the transcriptomic data further revealed that Bayesian optimization-guided feature selection can improve the accuracy of disease risk prediction models. In conclusion, Bayesian optimization can facilitate feature selection methods when hyper-parameter tuning is needed and has the potential to substantially benefit downstream tasks.
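
The abstract reports results as recall rates but does not define the term. As a minimal sketch, assuming the usual definition (the fraction of truly relevant features that a method selects), recall in a simulation where the relevant features are known could be computed as follows. The function name `recall_rate` and the feature indices are hypothetical and serve only as illustration; they are not taken from the paper.

```python
def recall_rate(selected, truly_relevant):
    """Fraction of the truly relevant features that appear in `selected`.

    Assumed definition: |selected AND relevant| / |relevant|.
    """
    relevant = set(truly_relevant)
    if not relevant:
        raise ValueError("truly_relevant must contain at least one feature")
    return len(relevant & set(selected)) / len(relevant)


# Hypothetical simulation setting: features 0-4 carry signal.
# The two selected sets are invented for illustration, not reported results.
truth = [0, 1, 2, 3, 4]
selected_a = [0, 1, 7, 9]      # recovers 2 of the 5 relevant features
selected_b = [0, 1, 2, 3, 8]   # recovers 4 of the 5 relevant features

print(recall_rate(selected_a, truth))  # 0.4
print(recall_rate(selected_b, truth))  # 0.8
```

Recall computed this way ignores false positives, so a method that selects every feature scores 1.0. A precision-style measure is needed alongside it to judge how selective a method is.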

Details

Title
The impact of Bayesian optimization on feature selection
Author
Yang, Kaixin 1 ; Liu, Long 1 ; Wen, Yalu 2 

1 Shanxi Medical University, Department of Health Statistics, School of Public Health, Taiyuan, China (GRID:grid.263452.4) (ISNI:0000 0004 1798 4018)
2 University of Auckland, Department of Statistics, Auckland, New Zealand (GRID:grid.9654.e) (ISNI:0000 0004 0372 3343)
Pages
3948
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
e-ISSN
2045-2322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2927742430
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.