Copyright © 2024 Hosam El-Sofany et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/

Abstract

With the increasing prevalence of diabetes in Saudi Arabia, early detection and prediction of the disease are critical to preventing long-term health complications. This study addresses that need with a computerized diabetes prediction system that applies machine learning (ML) techniques to the Pima Indians dataset and to private diabetes datasets. In contrast to prior research, the study employs a semisupervised model combined with strong gradient boosting to predict diabetes-related features of the dataset. The SMOTE technique is used to address class imbalance. Ten ML classification techniques (logistic regression, random forest, KNN, decision tree, bagging, AdaBoost, XGBoost, voting, SVM, and Naive Bayes) are evaluated to determine which algorithm predicts diabetes most accurately. For the private dataset, the XGBoost algorithm with SMOTE achieved an accuracy of 97.4%, an F1 score of 0.95, and an AUC of 0.87. For the combined datasets, it achieved an accuracy of 83.1%, an F1 score of 0.76, and an AUC of 0.85. To explain how the model arrives at its predictions, an explainable AI technique based on SHAP is implemented. The study also demonstrates the adaptability of the proposed system by applying a domain adaptation method. To improve accessibility, a mobile app has been developed for instant diabetes prediction from user-entered features. This study contributes novel insights and techniques to the field of ML-based diabetes prediction, potentially aiding the early detection and management of diabetes in Saudi Arabia.
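As a rough illustration of the oversampling idea behind SMOTE, which the abstract names as the remedy for class imbalance, the sketch below generates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation: the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy feature values are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper for illustration; not taken from the study.)"""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data with two features per sample (made-up values, not from the study).
majority = [(85.0, 22.0), (90.0, 24.5), (95.0, 26.0),
            (100.0, 23.0), (88.0, 21.0), (92.0, 25.0)]
minority = [(150.0, 33.0), (165.0, 35.5), (180.0, 38.0)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Because every synthetic point lies on a line segment between two real minority samples, the new points stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as accuracy, F1 score, and AUC are computed on unmodified data.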

Details

Title
A Proposed Technique Using Machine Learning for the Prediction of Diabetes Disease through a Mobile App
Author
El-Sofany, Hosam 1; El-Seoud, Samir A 2; Karam, Omar H 2; Abd El-Latif, Yasser M 3; Islam A T F Taj-Eddin 4

1 College of Computer Science, King Khalid University, Abha, Saudi Arabia; Cairo Higher Institute for Engineering, Computer Science and Management, Cairo, Egypt
2 British University in Egypt (BUE), Faculty of Informatics and Computer Science, Cairo, Egypt
3 Faculty of Science, Ain Shams University, Cairo, Egypt
4 Faculty of Computers and Information, Assiut University, Assiut, Egypt
Editor
Gianni Costa
Publication year
2024
Publication date
2024
Publisher
John Wiley & Sons, Inc.
ISSN
08848173
e-ISSN
1098111X
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2916947520