Full Text

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Background: This study aims to identify unique metabolomic biomarkers associated with Type 2 Diabetes (T2D) and to develop an accurate diagnostic model using tree-based machine learning (ML) algorithms integrated with bioinformatics techniques. Methods: Univariate and multivariate analyses, including fold change, receiver operating characteristic (ROC) curve analysis, and Partial Least-Squares Discriminant Analysis (PLS-DA), were used to identify biomarker metabolites that showed significantly altered concentrations in T2D patients. Three tree-based algorithms [eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Adaptive Boosting (AdaBoost)], which have demonstrated robustness in high-dimensional data analysis, were used to create a diagnostic model for T2D. Results: The biomarker discovery process, validated with three different approaches, identified Pyruvate, D-Rhamnose, AMP, pipecolate, Tetradecenoic acid, Tetradecanoic acid, Dodecanediothioic acid, Prostaglandin E3/D3 (isobars), ADP, and Hexadecenoic acid as potential biomarkers for T2D. The XGBoost model [accuracy = 0.831, F1-score = 0.845, sensitivity = 0.882, specificity = 0.774, positive predictive value (PPV) = 0.811, negative predictive value (NPV) = 0.857, and area under the ROC curve (AUC) = 0.887] achieved slightly higher performance measures than the other two models. Conclusions: ML integrated with bioinformatics techniques offers an accurate and promising approach to T2D candidate biomarker discovery. The XGBoost model can successfully distinguish T2D based on metabolites.
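The performance measures quoted in the abstract all derive from a binary confusion matrix, apart from AUC, which requires the model's predicted scores. The following is a minimal Python sketch of those standard definitions. The counts in `example` are hypothetical, chosen only so that the rounded values agree with the reported XGBoost figures; they are not taken from the article, and the names `ConfusionCounts` and `diagnostic_metrics` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (T2D = positive class)."""
    tp: int  # T2D samples predicted as T2D
    fn: int  # T2D samples predicted as control
    tn: int  # control samples predicted as control
    fp: int  # control samples predicted as T2D


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute the threshold-based measures named in the abstract."""
    sensitivity = c.tp / (c.tp + c.fn)   # recall on the positive class
    specificity = c.tn / (c.tn + c.fp)   # recall on the negative class
    ppv = c.tp / (c.tp + c.fp)           # positive predictive value (precision)
    npv = c.tn / (c.tn + c.fn)           # negative predictive value
    accuracy = (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "accuracy": accuracy,
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }


# Hypothetical counts (an assumption, not data from the study).
example = ConfusionCounts(tp=30, fn=4, tn=24, fp=7)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, the six printed values round to 0.831, 0.845, 0.882, 0.774, 0.811, and 0.857, which shows that the reported measures are mutually consistent under the usual definitions. It does not establish the study's actual test-set composition.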

Details

Title
Pilot-Study to Explore Metabolic Signature of Type 2 Diabetes: A Pipeline of Tree-Based Machine Learning and Bioinformatics Techniques for Biomarkers Discovery
Author
Fatma Hilal Yagin 1; Al-Hashem, Fahaid 2; Irshad, Ahmad 3; Fuzail Ahmad 4; Alkhateeb, Abedalrhman 5

1 Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey
2 Department of Physiology, College of Medicine, King Khalid University, Abha 61421, Saudi Arabia
3 Department of Medical Rehabilitation Sciences, College of Applied Medical Sciences, King Khalid University, Abha 61421, Saudi Arabia
4 Department of Respiratory Care, College of Applied Sciences, Almaarefa University, Diriya, Riyadh 13713, Saudi Arabia
5 Department of Computer Science, Lakehead University, Thunder Bay, ON P7B 5E1, Canada
First page
1537
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
20726643
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3059624420