Abstract

Purpose

The objectives were to use retrospective data to develop a machine learning (ML) model, based on electronic health record (EHR) data, that predicts the risk of vomiting within a 96-hour window after admission to the pediatric oncology and hematopoietic cell transplant (HCT) services, and to evaluate the model prospectively in a silent trial.

Patients and methods

Admissions to the oncology or HCT services between 2018-06-02 and 2024-02-17 (retrospective) and between 2024-05-09 and 2024-08-05 (prospective) were included. The data source was SEDAR, a curated and validated approach to delivering EHR data for ML. Prediction time was 08:30 on the morning following admission. The outcome was any vomiting within 96 h following prediction time. We trained models using L2-regularized logistic regression, LightGBM, and XGBoost. Training cohorts included the target cohort and all inpatient admissions.
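As an illustration of the prediction time and outcome window described above, the following minimal Python sketch derives a binary label for a single admission. It is not the study's code: the function names are hypothetical, and it assumes that "the morning following admission" means 08:30 on the next calendar day and that the 96 h window is open at the prediction time and closed at its end.

```python
from datetime import datetime, time, timedelta

# Outcome window length stated in the abstract.
OUTCOME_WINDOW = timedelta(hours=96)


def prediction_time(admitted_at: datetime) -> datetime:
    """Return 08:30 on the calendar day after admission.

    Assumption: "the morning following admission" is read as the next
    calendar day, regardless of the hour of admission.
    """
    next_day = admitted_at.date() + timedelta(days=1)
    return datetime.combine(next_day, time(8, 30))


def vomiting_label(admitted_at: datetime, vomiting_times: list[datetime]) -> int:
    """Return 1 if any vomiting event falls within 96 h after prediction time.

    Assumption: events at or before the prediction time are excluded, and an
    event exactly 96 h after the prediction time is included.
    """
    start = prediction_time(admitted_at)
    end = start + OUTCOME_WINDOW
    return int(any(start < t <= end for t in vomiting_times))


if __name__ == "__main__":
    admitted = datetime(2024, 5, 9, 14, 0)
    events = [datetime(2024, 5, 12, 10, 0)]
    print(prediction_time(admitted), vomiting_label(admitted, events))
```

Under this reading, vomiting that occurs between admission and 08:30 the next morning does not count toward the outcome, although it could still serve as an input feature.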

Results

There were 7,408 admissions in the retrospective phase and 340 admissions in the prospective silent trial phase. The best-performing model in the retrospective phase was the LightGBM model trained on all inpatients. The final model had 2,859 features. The area under the receiver operating characteristic curve (AUROC) was 0.730 (95% confidence interval (CI) 0.694–0.765) for the retrospective phase and 0.716 (95% CI 0.649–0.784) for the prospective silent trial phase.
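For readers unfamiliar with the metric, the sketch below shows one common way to compute an AUROC and a 95% CI: the pairwise (Mann-Whitney) formulation with a percentile bootstrap. The abstract does not state how the CIs were obtained, so the bootstrap, the function names, and the default parameters here are assumptions for illustration only.

```python
import random


def auroc(labels: list[int], scores: list[float]) -> float:
    """AUROC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (an assumed method, see lead-in)."""
    auroc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.55]
    print(auroc(y, p), bootstrap_ci(y, p, n_boot=200))
```

The pairwise formulation is quadratic in the number of admissions, which is acceptable for a sketch; a rank-based implementation would be preferable at the scale of the retrospective cohort.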

Conclusions

We found that EHR data could be used to develop an ML model, trained on retrospective data, to predict vomiting among pediatric oncology and HCT inpatients. This model retained satisfactory performance in a prospective silent trial. Future plans include deploying the model into clinical workflows and determining whether it improves vomiting control.

© 2025. This work is licensed under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.