© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Chronic kidney disease (CKD) is a worldwide public health problem that is usually diagnosed in its late stages. Alleviating this issue requires investment in early prediction. The purpose of this study is to assist the early prediction of CKD while addressing problems related to imbalanced and limited-size datasets. We used data from the medical records of Brazilians with or without a diagnosis of CKD, containing the following attributes: hypertension, diabetes mellitus, creatinine, urea, albuminuria, age, gender, and glomerular filtration rate. We present an oversampling approach based on manual and automated augmentation. We experimented with the synthetic minority oversampling technique (SMOTE), Borderline-SMOTE, and Borderline-SMOTE SVM. We implemented models based on the following algorithms: decision tree (DT), random forest, and multi-class AdaBoosted DTs. We also applied the overall local accuracy and local class accuracy methods for dynamic classifier selection, and the k-nearest oracles-union, k-nearest oracles-eliminate, and META-DES methods for dynamic ensemble selection. We analyzed the models' performance using hold-out validation, multiple stratified cross-validation (CV), and nested CV. The DT model achieved the highest accuracy (98.99%) when using manual augmentation and SMOTE. Our approach can assist in designing systems for the early prediction of CKD using imbalanced and limited-size datasets.
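
To illustrate the oversampling idea named in the abstract, the following is a minimal sketch of basic SMOTE using only the Python standard library. It is not the authors' implementation: the function name `smote`, its parameters, and the toy two-attribute records (creatinine, urea) are assumptions made for illustration, and the values are not taken from the study. Each synthetic record is placed at a random position on the line segment between a minority-class record and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Basic SMOTE sketch: interpolate between minority samples and
    their k nearest minority-class neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [creatinine, urea] pairs, NOT taken from the study.
majority = [[0.8, 25.0], [0.9, 30.0], [1.0, 28.0],
            [0.7, 22.0], [1.1, 35.0], [0.9, 27.0]]
minority = [[2.4, 80.0], [3.1, 95.0], [2.8, 88.0]]

# Generate enough synthetic minority records to match the majority class size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

In an evaluation such as the stratified or nested CV mentioned above, oversampling of this kind is normally applied to the training folds only, so that synthetic records do not leak into the data used for testing. The Borderline-SMOTE variants differ mainly in restricting the base samples to minority records near the class boundary.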

Details

Title
Exploring Early Prediction of Chronic Kidney Disease Using Machine Learning Algorithms for Small and Imbalanced Datasets
Author
Andressa C M da Silveira 1; Sobrinho, Álvaro 2; Leandro Dias da Silva 3; Evandro de Barros Costa 4; Pinheiro, Maria Eliete 4; Perkusich, Angelo 5

1 Electrical Engineering Department, Federal University of Campina Grande, Campina Grande 58428-830, Brazil; [email protected]
2 Computer Science, Federal University of the Agreste of Pernambuco, Garanhuns 55292-270, Brazil; Computing Institute, Federal University of Alagoas, Maceió 57072-900, Brazil; [email protected]
3 Computing Institute, Federal University of Alagoas, Maceió 57072-900, Brazil; [email protected]
4 Faculty of Medicine, Federal University of Alagoas, Maceió 57072-900, Brazil; [email protected] (E.d.B.C.); [email protected] (M.E.P.)
5 Virtus Research, Development and Innovation Center, Federal University of Campina Grande, Campina Grande 58428-830, Brazil; [email protected]
First page
3673
Publication year
2022
Publication date
2022
Publisher
MDPI AG
e-ISSN
20763417
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2648973263