
Abstract

The All of Us Research Program (AoU) is an initiative designed to gather a comprehensive and diverse dataset from at least one million individuals across the USA. This longitudinal cohort study aims to advance research by providing a rich resource of genetic and phenotypic information, enabling powerful studies on the epidemiology and genetics of human diseases. One critical challenge to maximizing its use is the development of algorithms that can efficiently and accurately identify well-defined disease and disease-free participants for case-control studies. This study aimed to develop and validate type 1 diabetes (T1D) and type 2 diabetes (T2D) algorithms in the AoU cohort using electronic health record (EHR) and survey data. Building on existing algorithms and using diagnosis codes, medications, laboratory results, and survey data, we developed and implemented algorithms for identifying prevalent cases of T1D and T2D. The first set of algorithms used only EHR data (EHR-only), and the second set used a combination of EHR and survey data (EHR+). A universal algorithm was also developed to identify individuals without diabetes. The performance of each algorithm was evaluated by testing its association with polygenic scores (PSs) for T1D and T2D. We demonstrated the feasibility and utility of applying diabetes algorithms to AoU EHR and survey data. For T1D, the EHR-only algorithm showed a stronger association with the T1D PS than the EHR+ algorithm (DeLong p-value = 3 × 10⁻⁵). For T2D, the EHR+ algorithm outperformed both the EHR-only algorithm and the existing T2D definition provided in the AoU Phenotyping Library (DeLong p-values = 0.03 and 1 × 10⁻⁴, respectively), identifying 25.79% and 22.57% more cases, respectively, and providing an improved association with the T2D PS. We provide a new validated T1D definition and an improved T2D definition, both freely available for diabetes research in AoU. These algorithms ensure consistency of diabetes definitions in the cohort, facilitating high-quality diabetes research.
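
The validation idea in the abstract, scoring each case definition by how well a polygenic score separates its cases from its controls, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function name `auc`, the toy scores, and the two label vectors are hypothetical and are not the authors' code or data. The sketch computes only the area under the ROC curve (via the Mann-Whitney statistic); the DeLong test reported in the abstract, which compares two correlated AUCs, is not implemented here.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the probability that a randomly chosen case has a higher
    score than a randomly chosen control, with ties counted as 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: one polygenic score per participant, and the
# case (1) / control (0) labels assigned by two candidate definitions.
polygenic_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels_definition_a = [0, 0, 0, 1, 1, 0]
labels_definition_b = [1, 0, 0, 1, 1, 0]

auc_a = auc(polygenic_score, labels_definition_a)
auc_b = auc(polygenic_score, labels_definition_b)

# A higher AUC means the definition's cases are better separated by the score.
print(f"definition A AUC: {auc_a:.3f}")
print(f"definition B AUC: {auc_b:.3f}")
```

In this toy setup, the definition with the higher AUC would be read as the one whose cases align more closely with genetic risk. A real comparison would also need a significance test for the AUC difference, such as the DeLong test named in the abstract.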

Details

Title
Algorithms for the identification of prevalent diabetes in the All of Us Research Program validated using polygenic scores
Author
Szczerbinski, Lukasz 1 ; Mandla, Ravi 2 ; Schroeder, Philip 3 ; Porneala, Bianca C. 4 ; Li, Josephine H. 5 ; Florez, Jose C. 5 ; Mercader, Josep M. 5 ; Udler, Miriam S. 5 ; Manning, Alisa K. 6 

1 Medical University of Bialystok, Department of Endocrinology, Diabetology and Internal Medicine, Bialystok, Poland (GRID:grid.48324.39) (ISNI:0000 0001 2248 2838); Medical University of Bialystok, Clinical Research Centre, Bialystok, Poland (GRID:grid.48324.39) (ISNI:0000 0001 2248 2838); Broad Institute of Harvard and MIT, Programs in Metabolism and Medical & Population Genetics, Cambridge, USA (GRID:grid.66859.34) (ISNI:0000 0004 0546 1623); Massachusetts General Hospital, Center for Genomic Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Massachusetts General Hospital, Diabetes Unit, Department of Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924)
2 Broad Institute of Harvard and MIT, Programs in Metabolism and Medical & Population Genetics, Cambridge, USA (GRID:grid.66859.34) (ISNI:0000 0004 0546 1623); Massachusetts General Hospital, Center for Genomic Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Massachusetts General Hospital, Diabetes Unit, Department of Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); University of California, Cardiology Division, Department of Medicine and Cardiovascular Research Institute, San Francisco, USA (GRID:grid.266102.1) (ISNI:0000 0001 2297 6811)
3 Broad Institute of Harvard and MIT, Programs in Metabolism and Medical & Population Genetics, Cambridge, USA (GRID:grid.66859.34) (ISNI:0000 0004 0546 1623); Massachusetts General Hospital, Center for Genomic Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Massachusetts General Hospital, Diabetes Unit, Department of Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924)
4 Massachusetts General Hospital, Division of General Internal Medicine, Department of Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924)
5 Broad Institute of Harvard and MIT, Programs in Metabolism and Medical & Population Genetics, Cambridge, USA (GRID:grid.66859.34) (ISNI:0000 0004 0546 1623); Massachusetts General Hospital, Center for Genomic Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Massachusetts General Hospital, Diabetes Unit, Department of Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Harvard Medical School, Department of Medicine, Boston, USA (GRID:grid.38142.3c) (ISNI:0000 0004 1936 754X)
6 Broad Institute of Harvard and MIT, Programs in Metabolism and Medical & Population Genetics, Cambridge, USA (GRID:grid.66859.34) (ISNI:0000 0004 0546 1623); Massachusetts General Hospital, Center for Genomic Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924); Harvard Medical School, Department of Medicine, Boston, USA (GRID:grid.38142.3c) (ISNI:0000 0004 1936 754X); Massachusetts General Hospital, Clinical and Translational Epidemiology Unit, Department of Medicine, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924)
Volume
14
Issue
1
Pages
26895
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
Place of publication
London
Country of publication
United States
Publication subject
e-ISSN
20452322
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2024-11-06
Milestone dates
2024-09-29 (Registration); 2024-04-20 (Received); 2024-09-29 (Accepted)
First posting date
06 Nov 2024
ProQuest document ID
3124952875
Document URL
https://www.proquest.com/scholarly-journals/algorithms-identification-prevalent-diabetes-all/docview/3124952875/se-2?accountid=208611
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2024-11-07
Database
ProQuest One Academic