© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Tuberculosis (TB) is an infectious disease that has been declared a global emergency by the World Health Organization and remains one of the top ten causes of death worldwide. TB diagnosis is particularly challenging in developing countries, where limited infrastructure for detection and treatment complicates efforts to control the disease. These resource constraints are especially critical in remote areas with few mechanisms for timely diagnosis, which is essential for effective patient management. Artificial intelligence (AI) has emerged as a valuable tool in supporting health professionals by enhancing diagnostic processes. This paper explores the use of natural language processing (NLP) techniques and machine learning (ML) models to facilitate TB diagnosis in settings where robust data infrastructure is unavailable. Two distinct data sources were analyzed: text extracted from electronic medical records (EMRs) and patient clinical data (CD). Four different ML-based approaches were implemented: two models using each data source independently and two data fusion models combining both sources. The relevance of these strategies was assessed in collaboration with physicians to ensure their practical applicability in clinical decision-making. The results of the data fusion models were compared to determine which source provided more valuable diagnostic information. The best-performing model, which relied solely on CD, achieved a sensitivity of 73%, outperforming smear microscopy, whose sensitivity typically ranges from 40% to 60%. These findings underscore the importance of analyzing physicians’ reports and assessing the availability of such information alongside structured clinical data. This approach is particularly beneficial in resource-limited settings, where access to comprehensive clinical data may be restricted.

Details

Title
Data Fusion of Medical Records and Clinical Data to Enhance Tuberculosis Diagnosis in Resource-Limited Settings
Author
Orjuela-Cañón, Alvaro D 1 ; Romero-Gómez, Andrés F 2 ; Jutinico, Andres L 3 ; Awad, Carlos E 4 ; Vergara, Erika 5 ; Palencia, Maria A 4

1 School of Medicine and Health Sciences, Universidad del Rosario, Bogota 111221, Colombia
2 Fundación Santa Fe de Bogotá, Bogota 110111, Colombia; [email protected]
3 Biomedical Engineering, Universidad Antonio Nariño, Bogota 110311, Colombia; [email protected]
4 Subred Integrada de Servicios de Salud Centro Oriente, Bogota 111711, Colombia; [email protected] (C.E.A.); [email protected] (M.A.P.)
5 Hospital Universitario Nacional, Bogota 111321, Colombia; [email protected]
First page
5423
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2076-3417
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3211858926