Abstract

Across jurisdictions, governments and health insurance providers hold large amounts of data from patient interactions with the healthcare system. We aimed to develop a machine learning-based model for predicting adverse outcomes due to diabetes complications using administrative health data from the single-payer health system in Ontario, Canada. A Gradient Boosting Decision Tree model was trained on data from 1,029,366 patients, validated on 272,864 patients, and tested on 265,406 patients. Discrimination was assessed using the AUC statistic, and calibration was assessed visually using calibration plots, both overall and across population subgroups. Our model predicting three-year risk of adverse outcomes due to diabetes complications (hyper/hypoglycemia, tissue infection, retinopathy, cardiovascular events, amputation) included 700 features from multiple diverse data sources and had strong discrimination (average test AUC = 77.7, range 77.7–77.9). Through the design and validation of a high-performance model to predict adverse outcomes due to diabetes complications at the population level, we demonstrate the potential of machine learning and administrative health data to inform health planning and healthcare resource allocation for diabetes management.
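
The two evaluation ideas named in the abstract, discrimination (AUC) and calibration, can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function names `roc_auc` and `calibration_bins` and the toy inputs are hypothetical. It assumes binary outcome labels (1 = adverse outcome) and predicted risks between 0 and 1.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC: probability that a random positive is scored above a random negative."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank shared by the ties
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U statistic of the positives, normalised to [0, 1].
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def calibration_bins(
    labels: Sequence[int], scores: Sequence[float], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Per-bin (mean predicted risk, observed event rate, count), bins of equal size."""
    pairs = sorted(zip(scores, labels))
    bins = []
    for b in range(n_bins):
        chunk = pairs[b * len(pairs) // n_bins:(b + 1) * len(pairs) // n_bins]
        if chunk:
            mean_pred = sum(s for s, _ in chunk) / len(chunk)
            obs_rate = sum(y for _, y in chunk) / len(chunk)
            bins.append((mean_pred, obs_rate, len(chunk)))
    return bins


if __name__ == "__main__":
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(toy_labels, toy_scores))
    print(calibration_bins(toy_labels, toy_scores, n_bins=2))
```

Plotting the first element of each bin against the second gives a calibration plot; points near the diagonal indicate predicted risks that match observed event rates. Repeating the computation within a population subgroup would give the subgroup view the abstract mentions.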

Details

Title
Predicting adverse outcomes due to diabetes complications with machine learning using administrative health data
Author
Ravaut, Mathieu 1; Sadeghi, Hamed 2; Leung, Kin Kwan 2; Volkovs, Maksims 2; Kornas, Kathy 3; Harish, Vinyas 4; Watson, Tristan 5; Lewis, Gary F 6; Weisman, Alanna 7; Poutanen, Tomi 2; Rosella, Laura 8

1 Layer 6 AI, Toronto, Canada; University of Toronto, Department of Computer Science, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938)
2 Layer 6 AI, Toronto, Canada (GRID:grid.17063.33)
3 University of Toronto, Dalla Lana School of Public Health, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938)
4 University of Toronto, Dalla Lana School of Public Health, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); University of Toronto, MD/PhD Program, Temerty Faculty of Medicine, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938)
5 University of Toronto, Dalla Lana School of Public Health, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); ICES, Toronto, Canada (GRID:grid.418647.8) (ISNI:0000 0000 8849 1617)
6 University of Toronto, Department of Medicine, Temerty Faculty of Medicine, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); University of Toronto, Department of Physiology, Temerty Faculty of Medicine, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938)
7 Mt. Sinai Hospital, Lunenfeld-Tanenbaum Research Institute, Toronto, Canada (GRID:grid.416166.2) (ISNI:0000 0004 0473 9881); University of Toronto, Division of Endocrinology and Metabolism, Temerty Faculty of Medicine, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938)
8 University of Toronto, Dalla Lana School of Public Health, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938); ICES, Toronto, Canada (GRID:grid.418647.8) (ISNI:0000 0000 8849 1617); Vector Institute, Toronto, Canada (GRID:grid.494618.6); Trillium Health Partners, Institute for Better Health, Mississauga, Canada (GRID:grid.417293.a) (ISNI:0000 0004 0459 7334); University of Toronto, Department of Laboratory Medicine & Pathology, Temerty Faculty of Medicine, Toronto, Canada (GRID:grid.17063.33) (ISNI:0000 0001 2157 2938)
Publication year
2021
Publication date
Dec 2021
Publisher
Nature Publishing Group
e-ISSN
2398-6352
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2528862431
Copyright
© The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.