© 2022. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Background: Although machine learning (ML) algorithms have been applied to point-of-care sepsis prognostication, ML has not been used to predict sepsis mortality in an administrative database. Therefore, we examined the performance of common ML algorithms in predicting sepsis mortality in adult patients with sepsis and compared it with that of the conventional context knowledge–based logistic regression approach.

Objective: The aim of this study is to examine the performance of common ML algorithms in predicting sepsis mortality in adult patients with sepsis and compare it with that of the conventional context knowledge–based logistic regression approach.

Methods: We examined inpatient admissions for sepsis in the US National Inpatient Sample using hospitalizations in 2010-2013 as the training data set. We developed 4 ML models to predict in-hospital mortality: logistic regression with least absolute shrinkage and selection operator regularization, random forest, gradient-boosted decision tree, and deep neural network. To estimate their performance, we compared our models with the Super Learner model. Using hospitalizations in 2014 as the testing data set, we examined the models’ area under the receiver operating characteristic curve (AUC), confusion matrix results, and net reclassification improvement.
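As an illustration of two of the evaluation measures named in the Methods, the following is a minimal sketch of how AUC and net reclassification improvement (NRI) can be computed from predicted probabilities. It is not the authors' code. The function names, the toy data, and the choice of the category-free (continuous) NRI variant are assumptions made for this example; the abstract does not state which NRI variant was used.

```python
from typing import Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen death (1) is scored
    higher than a randomly chosen survivor (0); ties count as 0.5.
    Pairwise O(n^2) version, fine for a sketch but not for large data."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def continuous_nri(y_true: Sequence[int],
                   p_ref: Sequence[float],
                   p_new: Sequence[float]) -> float:
    """Category-free NRI (assumed variant): net share of events whose
    predicted risk moves up, plus net share of non-events whose predicted
    risk moves down, when going from the reference to the new model."""
    up_e = down_e = up_n = down_n = 0
    n_events = n_nonevents = 0
    for y, ref, new in zip(y_true, p_ref, p_new):
        if y == 1:
            n_events += 1
            up_e += new > ref
            down_e += new < ref
        else:
            n_nonevents += 1
            up_n += new > ref
            down_n += new < ref
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


# Toy data (hypothetical): 1 = died in hospital, 0 = survived.
y = [1, 1, 0, 0]
reference_probs = [0.6, 0.4, 0.5, 0.3]   # stand-in for the reference model
ml_probs = [0.9, 0.7, 0.2, 0.4]          # stand-in for an ML model

auc_reference = auc(y, reference_probs)
auc_ml = auc(y, ml_probs)
nri = continuous_nri(y, reference_probs, ml_probs)
```

In a design like the one described, these functions would be applied only to the held-out testing year, so that the reported values reflect performance on hospitalizations the models did not see during training.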

Results: Hospitalizations of 923,759 adults were included in the analysis. Compared with the reference logistic regression (AUC: 0.786, 95% CI 0.783-0.788), all ML models showed superior discriminative ability (P<.001), including logistic regression with least absolute shrinkage and selection operator regularization (AUC: 0.878, 95% CI 0.876-0.879), random forest (AUC: 0.878, 95% CI 0.877-0.880), gradient-boosted decision tree (XGBoost; AUC: 0.888, 95% CI 0.886-0.889), and deep neural network (AUC: 0.893, 95% CI 0.891-0.895). All 4 ML models showed higher sensitivity, specificity, positive predictive value, and negative predictive value compared with the reference logistic regression model (P<.001). We obtained similar results from the Super Learner model (AUC: 0.883, 95% CI 0.881-0.885).
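The four confusion matrix measures reported above follow directly from the counts of true and false positives and negatives at a chosen probability cutoff. The sketch below shows that arithmetic on made-up data. The function name, the toy values, and the 0.5 threshold are assumptions for illustration; the abstract does not state the cutoff the authors used.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int],
                      y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV at a given cutoff.
    A case is predicted positive when its probability >= threshold.
    Undefined ratios (zero denominator) are returned as NaN."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # deaths correctly flagged
        "specificity": ratio(tn, tn + fp),  # survivors correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged cases that died
        "npv": ratio(tn, tn + fn),          # cleared cases that survived
    }


# Toy data (hypothetical): 3 deaths and 5 survivors.
y = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
metrics = confusion_metrics(y, probs, threshold=0.5)
```

Because positive and negative predictive values depend on how common death is in the sample, values computed this way are comparable between models only when they are evaluated on the same testing set.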

Conclusions: ML approaches can improve sensitivity, specificity, positive predictive value, negative predictive value, discrimination, and calibration in predicting in-hospital mortality in patients hospitalized with sepsis in the United States. These models need further validation and could be applied to develop more accurate models to compare risk-standardized mortality rates across hospitals and geographic regions, paving the way for research and policy initiatives studying disparities in sepsis care.

Details

Title
Predicting Sepsis Mortality in a Population-Based National Database: Machine Learning Approach
Author
Park, James Yeongjun; Hsu, Tzu-Chun; Hu, Jiun-Ruey; Chun-Yuan, Chen; Wan-Ting, Hsu; Lee, Matthew; Ho, Joshua; Chien-Chang, Lee
First page
e29982
Section
Clinical Informatics
Publication year
2022
Publication date
Apr 2022
Publisher
Gunther Eysenbach MD MPH, Associate Professor
e-ISSN
1438-8871
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2657523042