Abstract

Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation of patients’ entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting in-hospital mortality (area under the receiver operating characteristic curve [AUROC] across sites 0.93–0.94), 30-day unplanned readmission (AUROC 0.75–0.76), prolonged length of stay (AUROC 0.85–0.86), and all of a patient’s final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient’s chart.
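The results above are reported as AUROC and, for discharge diagnoses, as frequency-weighted AUROC. As a minimal sketch of what these metrics measure, the Python below computes AUROC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and then averages per-diagnosis AUROCs weighted by how many patients have each diagnosis. The abstract does not spell out the weighting, so weighting by positive-case count is an assumption here; the function names and the toy labels and scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def frequency_weighted_auroc(
    label_rows: Sequence[Sequence[int]],
    score_rows: Sequence[Sequence[float]],
) -> float:
    """Average per-diagnosis AUROC, weighting each diagnosis by its positive-case count.

    Assumption: "frequency" means the number of patients with the diagnosis.
    """
    weighted_sum = 0.0
    total_weight = 0
    for labels, scores in zip(label_rows, score_rows):
        weight = sum(labels)
        weighted_sum += weight * auroc(labels, scores)
        total_weight += weight
    return weighted_sum / total_weight


# Toy data (hypothetical): one row per diagnosis, one column per patient.
toy_labels = [[0, 0, 1, 1], [0, 1, 1, 1]]
toy_scores = [[0.1, 0.4, 0.35, 0.8], [0.2, 0.9, 0.6, 0.7]]

single = auroc(toy_labels[0], toy_scores[0])
weighted = frequency_weighted_auroc(toy_labels, toy_scores)
print(f"AUROC for first diagnosis: {single:.2f}")
print(f"Frequency-weighted AUROC: {weighted:.2f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; at the scale described in the abstract a rank-based computation would be the practical choice.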

Artificial intelligence: Algorithm predicts clinical outcomes for hospital inpatients

Artificial intelligence outperforms traditional statistical models at predicting a range of clinical outcomes from a patient’s entire raw electronic health record (EHR). A team led by Alvin Rajkomar and Eyal Oren from Google in Mountain View, California, USA, developed a data processing pipeline for transforming EHR files into a standardized format. They then applied deep learning models to data from 216,221 adult patients hospitalized for at least 24 h each at two academic medical centers, and showed that their algorithm could accurately predict risk of mortality, hospital readmission, prolonged hospital stay and discharge diagnosis. In all cases, the method proved more accurate than previously published models. The authors provide a case study to serve as a proof-of-concept of how such an algorithm could be used in routine clinical practice in the future.

Details

Title
Scalable and accurate deep learning with electronic health records
Author
Rajkomar, Alvin 1; Oren, Eyal 2; Chen, Kai 2; Dai, Andrew M 2; Hajaj, Nissan 2; Hardt, Michaela 2; Liu, Peter J 2; Liu, Xiaobing 2; Marcus, Jake 2; Sun, Mimi 2; Sundberg, Patrik 2; Yee, Hector 2; Zhang, Kun 2; Zhang, Yi 2; Flores, Gerardo 2; Duggan, Gavin E 2; Irvine, Jamie 2; Le, Quoc 2; Litsch, Kurt 2; Mossin, Alexander 2; Tansuwan, Justin 2; Wang, De 2; Wexler, James 2; Jimbo, Wilson 2; Ludwig, Dana 3; Volchenboum, Samuel L 4; Chou, Katherine 2; Pearson, Michael 2; Srinivasan, Madabushi 2; Shah, Nigam H 5; Butte, Atul J 3; Howell, Michael D 2; Cui, Claire 2; Corrado, Greg S 2; Dean, Jeffrey 2

1 Google Inc, Mountain View, USA (GRID:grid.420451.6); University of California, San Francisco, San Francisco, USA (GRID:grid.266102.1) (ISNI:0000 0001 2297 6811)
2 Google Inc, Mountain View, USA (GRID:grid.420451.6)
3 University of California, San Francisco, San Francisco, USA (GRID:grid.266102.1) (ISNI:0000 0001 2297 6811)
4 University of Chicago Medicine, Chicago, USA (GRID:grid.170205.1) (ISNI:0000 0004 1936 7822)
5 Stanford University, Stanford, USA (GRID:grid.168010.e) (ISNI:0000 0004 1936 8956)
Publication year
2018
Publication date
Dec 2018
Publisher
Nature Publishing Group
e-ISSN
2398-6352
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2531366223
Copyright
© The Author(s) 2018. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.