Abstract

Bacterial identification, antimicrobial resistance prediction, and strain typification are critical tasks in clinical microbiology, essential for guiding patient treatment and controlling the spread of infectious diseases. While Machine Learning (ML) has shown immense promise in enhancing MALDI-TOF mass spectrometry applications for these tasks, an up-to-date, comprehensive review from an ML perspective is currently lacking. To address this gap, we systematically reviewed 93 studies published between 2004 and 2024, focusing on key ML aspects such as data size and balance, preprocessing pipelines, model selection and evaluation, and the availability of open-source data and code. Our analysis highlights the predominant use of classical ML models such as Random Forest and Support Vector Machines, alongside emerging interest in Deep Learning approaches for handling complex, high-dimensional data. Despite significant progress, challenges persist: inconsistent preprocessing workflows, reliance on black-box models, limited external validation, and insufficient open-source resources all hinder transparency, reproducibility, and broader adoption. This review offers actionable insights to enhance ML-driven bacterial diagnostics, advocating for standardized methodologies, greater transparency, and improved data accessibility. In addition, we provide guidelines on how to approach ML for MALDI-TOF analysis, helping researchers navigate key decisions in model development and evaluation.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* Final version submitted to journal. Text, figures, and tables have been revised and updated.

Details

Title
Machine Learning applied to MALDI-TOF data in a clinical setting: a systematic review
Publication title
bioRxiv; Cold Spring Harbor
Publication year
2025
Publication date
Mar 3, 2025
Section
New Results
Publisher
Cold Spring Harbor Laboratory Press
Source
BioRxiv
Place of publication
Cold Spring Harbor
Country of publication
United States
University/institution
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
Milestone dates
2025-01-28 (Version 1)
ProQuest document ID
3160657507
Document URL
https://www.proquest.com/working-papers/machine-learning-applied-maldi-tof-data-clinical/docview/3160657507/se-2?accountid=208611
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by-nc/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-03-04
Database
ProQuest One Academic