
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Latent tuberculosis infection (LTBI) poses a significant public health challenge, especially in populations with high HIV prevalence and limited healthcare access. Early detection and targeted interventions are essential to prevent progression to active tuberculosis. This study aimed to identify the key factors influencing LTBI outcomes through the application of predictive models, including logistic regression and machine learning techniques, while also evaluating strategies to enhance LTBI awareness and testing. Data from rural areas of the Eastern Cape, South Africa, were analyzed using logistic regression, decision trees, and random forests to identify key determinants of LTBI positivity among demographic, health, and knowledge-related factors. The models evaluated factors such as age, HIV status, and LTBI awareness, with random forests demonstrating the best balance of accuracy and interpretability. Additionally, a knowledge diffusion model was employed to assess the effectiveness of educational strategies in increasing LTBI awareness and testing uptake. Logistic regression achieved an accuracy of 68% with high precision (70%) but low recall (33%) for LTBI-positive cases, identifying age, HIV status, and LTBI awareness as significant predictors. The random forest model achieved an accuracy of 59.26% and an F1-score of 0.63, providing a better balance between precision and recall than logistic regression. Feature importance analysis revealed that age, occupation, and knowledge of LTBI symptoms were the most critical factors across both models. The knowledge diffusion model demonstrated that targeted interventions significantly increased LTBI awareness and testing, particularly in high-risk groups.
While logistic regression offers more interpretable results for public health interventions, machine learning models like random forests provide enhanced predictive power by capturing complex relationships between demographics and health factors. These findings highlight the need for targeted educational campaigns and increased LTBI testing in high-risk populations, particularly those with limited awareness of LTBI symptoms.
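
The "balance between precision and recall" that the abstract refers to is usually summarized by the F1-score, the harmonic mean of the two. The sketch below is illustrative only and is not the authors' code: it derives the F1-score implied by the reported logistic regression precision (70%) and recall (33%) for LTBI-positive cases and prints it next to the reported random forest F1-score (0.63). The helper name `f1_score` is an assumption made for this example, and the abstract does not state which class or averaging scheme the 0.63 figure refers to, so the comparison is indicative rather than exact.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for logistic regression (LTBI-positive cases).
lr_precision = 0.70
lr_recall = 0.33
lr_f1 = f1_score(lr_precision, lr_recall)

# F1-score reported in the abstract for the random forest model.
# Assumption: it is comparable to the LTBI-positive figures above.
rf_f1_reported = 0.63

print(f"Logistic regression F1 (derived): {lr_f1:.2f}")
print(f"Random forest F1 (reported):      {rf_f1_reported:.2f}")
```

Under these assumptions the derived logistic regression F1-score is about 0.45. Because the harmonic mean is pulled toward the smaller of its two inputs, the low recall dominates despite the high precision.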

Details

Title
Exploring Determinants and Predictive Models of Latent Tuberculosis Infection Outcomes in Rural Areas of the Eastern Cape: A Pilot Comparative Analysis of Logistic Regression and Machine Learning Approaches
Author
Faye, Lindiwe Modest 1; Magwaza, Cebo 1; Dlatu, Ntandazo 2; Apalata, Teke 1

1 Department of Laboratory Medicine and Pathology, Walter Sisulu University, Private Bag X1, Mthatha 5100, South Africa; [email protected] (C.M.); [email protected] (T.A.)
2 Department of Public Health, Faculty of Health Sciences, Walter Sisulu University, Private Bag X1, Mthatha 5100, South Africa; [email protected]
First page
239
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2078-2489
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3181516335