© 2025. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Background: Hypertension is a major global health issue and a significant modifiable risk factor for cardiovascular diseases, and its high prevalence imposes a substantial socioeconomic burden. In China, hypertension is even more prevalent among populations living near desert regions because of unique environmental and lifestyle conditions. This exacerbates the disease burden in these areas and underscores the urgent need for effective early detection and intervention strategies.

Objective: This study aims to develop, calibrate, and prospectively validate a 2-year hypertension risk prediction model by using large-scale health examination data collected from populations residing in 4 regions surrounding the Taklamakan Desert of northwest China.

Methods: We retrospectively analyzed health examination data from 1,038,170 adults (2019-2021) and prospectively validated our findings in a separate cohort of 961,519 adults (2021-2023). Data included demographics, lifestyle factors, physical examinations, and laboratory measurements. Feature selection combined light gradient-boosting machine–based recursive feature elimination with cross-validation and the Least Absolute Shrinkage and Selection Operator, yielding 24 key predictors. Multiple machine learning models (logistic regression, random forest, extreme gradient boosting, light gradient-boosting machine, CatBoost) and deep learning models (Feature Tokenizer + Transformer, SAINT) were trained with Bayesian hyperparameter optimization.

Results: Over a 2-year follow-up, 15.20% (157,766/1,038,170) of the participants in the retrospective cohort and 10.50% (101,077/961,519) in the prospective cohort developed hypertension. Among the models developed, the CatBoost model demonstrated the best performance, achieving area under the curve (AUC) values of 0.888 (95% CI 0.886-0.889) in the retrospective cohort and 0.803 (95% CI 0.801-0.804) in the prospective cohort. Calibration via isotonic regression improved the model’s probability estimates, with Brier scores of 0.090 (95% CI 0.089-0.091) and 0.102 (95% CI 0.101-0.103) in the internal validation and prospective cohorts, respectively. Participants were ranked by the positive predictive value calculated using the calibrated model and stratified into 4 risk categories (low, medium, high, and very high), with the very high group exhibiting a 41.08% (5741/13,975) hypertension incidence over 2 years. Age, BMI, and socioeconomic factors were identified as significant predictors of hypertension.
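
As an illustration of the calibration step reported in the Results, the following minimal sketch fits an isotonic (non-decreasing) mapping from raw model scores to observed outcome rates and compares Brier scores before and after. It is not the authors' code: the function names, the pool-adjacent-violators implementation, and the eight toy scores and labels are assumptions made for this example, and the calibrated score is evaluated on the same data used to fit the mapping, which a real analysis would avoid.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function with pool adjacent violators."""
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while scores tie or the block means decrease.
        while len(blocks) > 1 and (
            blocks[-2][2] == blocks[-1][2]
            or blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    uppers = [b[2] for b in blocks]
    means = [b[0] / b[1] for b in blocks]
    return uppers, means


def calibrate(score, uppers, means):
    """Return the fitted outcome rate of the block that contains the score."""
    index = bisect_left(uppers, score)
    return means[min(index, len(means) - 1)]  # clip scores above the range


def brier_score(probabilities, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, labels)) / len(labels)


# Hypothetical toy data: raw model scores and observed outcomes (1 = event).
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 0, 1, 1, 1]

uppers, means = fit_isotonic(raw_scores, outcomes)
calibrated = [calibrate(s, uppers, means) for s in raw_scores]

raw_brier = brier_score(raw_scores, outcomes)
cal_brier = brier_score(calibrated, outcomes)
print(f"Brier score before calibration: {raw_brier:.4f}")
print(f"Brier score after calibration:  {cal_brier:.4f}")
```

On this toy data the calibrated probabilities give a lower Brier score than the raw scores. A lower Brier score indicates that predicted probabilities sit closer to observed outcomes, which is the sense in which the abstract describes calibration as improving the probability estimates.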

Conclusions: Our machine learning model effectively predicted the 2-year risk of hypertension, making it particularly suitable for preventive health care management in high-risk populations residing in the desert regions of China. The model showed strong predictive performance (AUC 0.803 in prospective validation) and has potential for clinical application. A web-based application built on the model makes it more accessible for clinical and public health use and may help reduce the burden of hypertension through timely prevention strategies.

Details

Title
Two-Year Hypertension Incidence Risk Prediction in Populations in the Desert Regions of Northwest China: Prospective Cohort Study
Author
Cheng, Yinlin; Gu, Kuiying; Ji, Weidong; Hu, Zhensheng; Yang, Yining; Zhou, Yi
First page
e68442
Section
Clinical Information and Decision Making
Publication year
2025
Publication date
2025
Publisher
Gunther Eysenbach MD MPH, Associate Professor
e-ISSN
1438-8871
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3222368831