© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

When a binary response variable contains an excess of zeros, the data are imbalanced, and imbalanced data cause trouble for binary classification. To simplify the numerical computation of the maximum likelihood estimates of the zero-inflated Bernoulli (ZIBer) model parameters with imbalanced data, an expectation-maximization (EM) algorithm is proposed. A logistic regression model links the Bernoulli probabilities with the covariates in the ZIBer model, and the predictive performance of the ZIBer model, LightGBM, and artificial neural network (ANN) procedures is compared by Monte Carlo simulation. The results show that no method dominates the others in predictive performance under imbalanced data. The LightGBM and ZIBer models are more competitive than the ANN model for zero-inflated imbalanced data sets.
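
To make the EM idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common ZIBer parametrization that the abstract does not spell out: a constant zero-inflation probability `omega` (no covariates on the inflation part), so that P(Y = 1 | x) = (1 - omega) * p(x) with p(x) = sigmoid(x' beta). The names `fit_ziber_em`, `observed_loglik`, `omega`, and `tau`, as well as the use of a few Newton steps in the M-step, are illustrative assumptions.

```python
import numpy as np


def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))


def observed_loglik(X, y, beta, omega):
    """Observed-data log-likelihood under the assumed ZIBer form
    P(Y = 1 | x) = (1 - omega) * sigmoid(x' beta)."""
    p1 = (1.0 - omega) * sigmoid(X @ beta)
    p1 = np.clip(p1, 1e-12, 1.0 - 1e-12)  # guard the logarithms
    return float(np.sum(y * np.log(p1) + (1.0 - y) * np.log(1.0 - p1)))


def fit_ziber_em(X, y, n_iter=200, newton_steps=5, tol=1e-8):
    """EM sketch for a zero-inflated Bernoulli model (assumed form).

    X : (n, d) design matrix, including an intercept column if wanted.
    y : (n,) array of 0/1 responses.
    Returns (beta, omega, history of observed log-likelihoods).
    """
    n, d = X.shape
    beta = np.zeros(d)   # starting values are arbitrary choices
    omega = 0.5
    history = [observed_loglik(X, y, beta, omega)]

    for _ in range(n_iter):
        # E-step: posterior probability that an observation is a
        # structural zero. A response of 1 can never be one.
        p = sigmoid(X @ beta)
        tau = np.where(y == 0, omega / (omega + (1.0 - omega) * (1.0 - p)), 0.0)

        # M-step, part 1: closed-form update of the inflation probability.
        omega = tau.mean()

        # M-step, part 2: weighted logistic regression with weights 1 - tau,
        # approximated here by a few Newton steps.
        w = 1.0 - tau
        for _ in range(newton_steps):
            p = sigmoid(X @ beta)
            grad = X.T @ (w * (y - p))
            hess = (X * (w * p * (1.0 - p))[:, None]).T @ X + 1e-8 * np.eye(d)
            beta = beta + np.linalg.solve(hess, grad)

        history.append(observed_loglik(X, y, beta, omega))
        if abs(history[-1] - history[-2]) < tol:
            break

    return beta, omega, history
```

Calling `fit_ziber_em(X, y)` on a design matrix with an intercept column returns the coefficient vector, the estimated inflation probability, and the log-likelihood trace. With a constant `omega`, the inflation probability is identified only through the variation of p(x) across covariates, so convergence can be slow.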

Details

Title
Binary Classification with Imbalanced Data
Author
Chiang, Jyun-You 1; Lio, Yuhlong 2; Hsu, Chien-Ya 3; Chia-Ling, Ho 4; Tsai, Tzong-Ru 3

1 School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China
2 Department of Mathematical Sciences, University of South Dakota, Vermillion, SD 57069, USA
3 Department of Statistics, Tamkang University, New Taipei City 251301, Taiwan
4 Department of Risk Management and Insurance, Tamkang University, New Taipei City 251301, Taiwan
First page
15
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
1099-4300
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2918688571