Biomarker panels for improved risk prediction and

Introduction

Patients with atrial fibrillation (AF) have an increased risk of death, cardiovascular events, and bleeding. Findings from prospective cohort studies suggest that AF is associated with an approximately 5-fold increased risk of stroke, 2-fold increased risk of myocardial infarction and death¹⁻³, and at least a 3-fold increased risk of heart failure¹,⁴. Additionally, AF patients have an increased risk of major bleeding due to oral anticoagulation treatment⁵. As these complications carry significant risks for the individual patient and impose a significant burden on the healthcare system, it is of substantial clinical relevance to gain a better understanding of their underlying pathophysiology and to improve risk prediction. Recent histological analyses from AF biopsies showed that the atrial tissue undergoes extensive changes over time, due to inflammation, blood clotting, vascular permeability, and collagen infiltration⁶. Circulating biomarkers that reflect these pathological pathways offer a promising and less invasive approach to assessing their involvement in individual patients, compared to more invasive diagnostic methods, such as tissue biopsies. Here, we report biomarker patterns associated with major adverse cardiac events and bleeding in AF patients, using a panel of selected biomarkers reflecting distinct disease pathways. By leveraging both traditional statistical methods and machine learning models, we gain a deeper understanding of AF pathophysiology and improve risk prediction for major cardiovascular events and bleeding complications in this population.

Results

Baseline characteristics of the cohort

A total of 3817 AF patients were included in this analysis. Mean age was 71 ± 10 years, and 1067 (28%) were female (Table 1). Nearly half had paroxysmal AF (49%), and non-paroxysmal forms of AF were present in 51% (28% persistent and 23% permanent AF). Hypertension was the most common cardiovascular risk factor (69%), followed by coronary artery disease (27%) and heart failure (24%). Overall, 84% of patients were on oral anticoagulation. Spearman rank correlations between biomarkers are presented in Supplementary Fig. 1. HsTropT showed strong correlations (>0.60) with GDF-15 (0.64) and cystatin C (0.62). Osteopontin (OPN) was strongly correlated with cystatin C (0.77) and GDF-15 (0.64). IGFBP-7 exhibited strong correlations with cystatin C (0.68) and GDF-15 (0.67). NT-proBNP had a strong correlation with ANG-2 (0.69).

Table 1. Baseline characteristics

Characteristic	N = 3817
Age, years	71 ± 10
Female sex	1067 (28)
Body mass index, kg/m²	27 ± 5
Systolic blood pressure, mmHg	134 ± 19
Heart rate, bpm	70 ± 17
CHA₂DS₂-VASc score	3.2 ± 1.7
Smoking status
Active	298 (8)
Past	1836 (48)
Never	1673 (44)
Type of atrial fibrillation
Paroxysmal	1869 (49)
Persistent	1070 (28)
Permanent	875 (23)
Medical history
Hypertension	2635 (69)
Diabetes	614 (16)
Prior stroke/TIA	651 (17)
Coronary artery disease	1021 (27)
Prior myocardial infarction	564 (15)
Peripheral vascular disease	281 (7)
Heart failure	909 (24)
Major bleeding	146 (4)
Chronic kidney disease	708 (19)
Oral anticoagulation	3212 (84)
Direct oral anticoagulants	1346 (35)
Vitamin K antagonists	1864 (49)

Data are presented as means ± standard deviations or counts (percentages).

TIA transient ischemic attack.

Associations of biomarkers with cardiovascular outcomes

To identify biomarkers associated with cardiovascular events, we conducted age- and sex-adjusted, and multivariable-adjusted Cox regression analyses for each biomarker and outcome (Supplementary Tables 1–10 and Supplementary Fig. 2). For the composite outcome of cardiovascular death, nonfatal ischemic stroke, nonfatal systemic embolism, or nonfatal myocardial infarction, 5 biomarkers (d-dimer, GDF-15, IL-6, NT-proBNP, and hsTropT) independently contributed to the model fit (Supplementary Table 1, Fig. 1A). Notably, hsTropT, NT-proBNP, and GDF-15 were among the most significant variables in the model (Fig. 2A). For heart failure hospitalization, 4 biomarkers (GDF-15, IGFBP-7, NT-proBNP, and hsTropT) were significantly associated with the outcome (Supplementary Table 2, Fig. 1B), with NT-proBNP and GDF-15 being among the most important risk predictors (Fig. 2B). GDF-15, IGFBP-7, IL-6, and hsTropT were associated with an increased risk of major bleeding events (Supplementary Table 3, Fig. 1C), with GDF-15 being one of the most important biomarkers in the model (Fig. 2C). NT-proBNP and IL-6 were associated with both ischemic stroke and the composite of ischemic and hemorrhagic stroke (Supplementary Tables 4 and 5, Fig. 1D, E), and NT-proBNP was among the most important risk indicators (Fig. 2D, E). Both IL-6 and hsTropT were linked to a higher risk of MI (Supplementary Table 6, Fig. 1F), and both were among the major predictors in the model (Fig. 2F). For cardiovascular death, GDF-15, IL-6, NT-proBNP, and hsTropT were associated with the outcome (Supplementary Table 7, Fig. 1G). GDF-15, NT-proBNP, and hsTropT were the most important predictors (Fig. 2G). D-dimer, GDF-15, IGFBP-7, IL-6, NT-proBNP, and hsTropT were associated with all-cause death (Supplementary Table 8, Fig. 1H). GDF-15, IL-6, and hsTropT were key risk predictors (Fig. 2H).
GDF-15 was associated with both the composite bleeding outcome and clinically relevant NM bleeding; in addition, IL-6 was associated with composite bleeding and NT-proBNP with clinically relevant NM bleeding (Supplementary Tables 9 and 10, Fig. 1I, J). GDF-15 was the most important risk indicator in these models (Fig. 2I, J). In sensitivity analyses restricted to patients on oral anticoagulation at baseline, the associations between biomarkers and major bleeding, all strokes, and ischemic stroke remained consistent (Supplementary Tables 11–13 and Supplementary Figs. 3 and 4).

Fig. 1 Associations between selected biomarkers and adverse cardiovascular outcomes from combined Cox models. [Images not available. See PDF.]

This figure shows standardized hazard ratios and 95% CIs of associations between backward-selected biomarkers and different adverse cardiac outcomes, derived from combined multivariable Cox models. A shows effect estimates of selected biomarkers for the composite outcome of cardiovascular death, nonfatal ischemic stroke, nonfatal systemic embolism, or nonfatal myocardial infarction. B shows effect estimates of selected biomarkers for heart failure hospitalization. C shows effect estimates of selected biomarkers for major bleeding. D shows effect estimates of selected biomarkers for all strokes. E shows effect estimates of selected biomarkers for ischemic stroke. F shows effect estimates of selected biomarkers for myocardial infarction. G shows effect estimates of selected biomarkers for cardiovascular death. H shows effect estimates of selected biomarkers for all-cause death. I shows effect estimates of selected biomarkers for any bleeding. J shows effect estimates of selected biomarkers for clinically relevant non-major (NM) bleeding. All outcomes were assessed in N = 3817 AF patients. Dots and whiskers represent hazard ratios and 95% CIs. ALAT Alanine aminotransferase, ANG-2 Angiopoietin-2, GDF-15 growth differentiation factor‑15, hsTropT high-sensitivity troponin T, IGFBP-7 Insulin-like growth factor-binding protein-7, IL-6 Interleukin-6, NT-proBNP N-terminal pro-B-type natriuretic peptide, OPN Osteopontin.

Fig. 2 Relative importance of predictors from combined Cox models. [Images not available. See PDF.]

This figure shows the relative importance of each clinical variable and backward-selected biomarkers for different adverse cardiac outcomes, derived from combined multivariable Cox models. A shows relative importance of variables for the association with the composite outcome of cardiovascular death, nonfatal ischemic stroke, nonfatal systemic embolism, or nonfatal myocardial infarction. B shows relative importance of variables for the association with heart failure hospitalization. C shows relative importance of variables for the association with major bleeding. D shows relative importance of variables for the association with all strokes. E shows relative importance of variables for the association with ischemic stroke. F shows relative importance of variables for the association with myocardial infarction. G shows relative importance of variables for the association with cardiovascular death. H shows relative importance of variables for the association with all-cause death. I shows relative importance of variables for the association with any bleeding. J shows relative importance of variables for the association with clinically relevant non-major (NM) bleeding. All outcomes were assessed in N = 3817 AF patients. Dots represent the partial χ² statistic minus the predictor degrees of freedom. Source data are provided as a Source Data file. ALAT Alanine aminotransferase, ANG-2 Angiopoietin-2, BMI body mass index, CAD coronary artery disease, CKD chronic kidney disease, eGFR estimated glomerular filtration rate, GDF-15 growth differentiation factor‑15, hsTropT high-sensitivity troponin T, IGFBP-7 Insulin-like growth factor-binding protein-7, IL-6 Interleukin-6, NT-proBNP N-terminal pro-B-type natriuretic peptide, OPN Osteopontin, SBP Systolic blood pressure.

Comparison of biomarker model with clinical risk scores for stroke and major bleeding prediction

Figure 3 shows the discriminatory performance of the Cox model with and without biomarkers compared to clinical risk scores. For the composite stroke outcome, the inclusion of biomarkers significantly improved risk prediction relative to the CHA₂DS₂-VASc (AUC: 0.69 vs. 0.64; P = 0.0003) and the ABC-stroke score (AUC: 0.69 vs. 0.68; P = 0.02). For ischemic stroke, the biomarker model improved risk prediction as compared to the CHA₂DS₂-VASc (AUC: 0.68 vs. 0.63; P = 0.003) and the ABC-stroke score (AUC: 0.68 vs. 0.66; P = 0.03). For major bleeding, the biomarker model demonstrated improved predictive ability compared to the HAS-BLED score (AUC: 0.69 vs. 0.59; P = 0.007).

Fig. 3 Discriminatory performance of base Cox models with biomarkers vs. clinical scores for stroke and major bleeding. [Images not available. See PDF.]

This figure shows ROC curves of base Cox models with and without biomarkers and clinical risk scores for all strokes, ischemic stroke, and major bleeding. All outcomes were assessed in N = 3817 AF patients. ABC age, biomarkers, clinical history stroke risk score, CHA₂DS₂-VASc Congestive heart failure, Hypertension, Age (2 points if age >75 y), Diabetes, Stroke, Vascular disease, Sex category, HAS-BLED Hypertension, Abnormal renal/liver function, Stroke, Bleeding history or predisposition, Labile international normalized ratio, Elderly (>65 years), Drugs/alcohol concomitantly.

Comparison of Cox and machine learning models with and without biomarkers for predicting cardiovascular outcomes

We investigated the impact of adding biomarkers to traditional Cox models and machine learning algorithms on cardiovascular risk prediction (Fig. 4, Supplementary Table 14). For the composite outcome, the inclusion of biomarkers significantly improved model performance, increasing the AUC of the combined Cox model from 0.74 to 0.77 (P = 2.6 × 10⁻⁸). The random forest model showed an increase in AUC from 0.74 to 0.75 (P = 0.03), while the XGBoost model demonstrated an improvement from 0.95 to 0.97 (P = 0.0007345). For heart failure hospitalization, the inclusion of biomarkers enhanced predictive accuracy across all models. The combined Cox model’s AUC increased from 0.77 to 0.80 (P = 5.5 × 10⁻¹⁰), the LASSO model from 0.80 to 0.83 (P = 0.04), the random forest model from 0.77 to 0.80 (P = 0.0002564), and the XGBoost model from 0.96 to 0.98 (P = 5.0 × 10⁻⁶). For major bleeding events, the addition of biomarkers resulted in improvements in some models. The combined Cox model’s AUC increased from 0.67 to 0.68 (P = 0.01), and the random forest model showed a small, non-significant increase from 0.63 to 0.65 (P = 0.10). The LASSO model likewise did not show a significant change (AUC from 0.69 to 0.70; P = 0.50). In contrast, the XGBoost model exhibited an improvement, with the AUC increasing from 0.94 to 0.97 (P = 8.8 × 10⁻⁵). Similar trends were observed for all secondary outcomes. When comparing biomarker-based Cox and machine learning models to established clinical risk scores, the Cox and most machine learning models demonstrated higher AUC values than the ABC-stroke and CHA₂DS₂-VASc for stroke prediction, and the HAS-BLED for major bleeding (Supplementary Fig. 5). In sensitivity analyses restricted to patients receiving oral anticoagulation, the results for stroke and major bleeding remained consistent, with most models incorporating biomarkers showing higher AUC values (Supplementary Table 15 and Supplementary Fig. 6).

Fig. 4 Predictive performance of Cox and machine learning models for outcomes with and without biomarkers. [Images not available. See PDF.]

The figure shows the AUC with 95% CI for Cox and machine learning models, comparing the performance of base and base + biomarker models for different adverse cardiac outcomes. The combined Cox models include age, sex, body mass index, current smoker, systolic blood pressure, history of diabetes, prior stroke or TIA, history of heart failure, chronic kidney disease, coronary artery disease, and backward-selected biomarkers. The machine learning models include all variables listed in the Supplementary Table 16 and all biomarkers. All outcomes were assessed in N = 3817 AF patients. Dots represent AUC values and whiskers indicate 95% CIs. AUC area under the curve.

Discussion

In this cohort of 3817 well-phenotyped AF patients, we identified several biomarkers associated with adverse cardiovascular events, including markers of myocardial injury (hsTropT), inflammation (IL-6), oxidative stress (GDF-15), coagulation (d-dimer), and cardiac dysfunction (NT-proBNP, IGFBP-7). The integration of these biomarkers into both traditional and machine learning-based predictive models significantly improved risk prediction, providing a more comprehensive assessment of adverse cardiovascular outcomes in this population. The improvement in predictive power was modest for most analyses.

Our analysis identified 6 biomarkers independently associated with AF-related complications and bleeding. GDF-15, a member of the TGF-β superfamily induced in cardiomyocytes, plays a significant role in oxidative stress, inflammation, cardiac injury, and fibrosis⁷. Epidemiological studies suggest that elevated GDF-15 levels increase the risk of bleeding in patients with cardiovascular diseases or AF⁸⁻¹⁰. Our findings not only confirm this association but also highlight GDF-15 as a robust predictor of heart failure hospitalization, with its predictive strength comparable to NT-proBNP and exceeding that of IGFBP-7 (Supplementary Table 2)¹¹. IL-6, a well-established pro-inflammatory cytokine, has been linked to the pathophysiology of cardiovascular disease, particularly in AF patients¹². Our study expands on these findings by demonstrating an association between IL-6 levels and both major and any bleeding events, suggesting that systemic inflammation may play a role in disrupting coagulation and vascular permeability. Additionally, IL-6 was significantly associated with stroke outcomes, consistent with findings from Mendelian randomization studies identifying IL-6 as a causal mediator of ischemic stroke in non-AF populations¹³. Genetic studies also underscore the relationship between IL-6 and atherosclerosis¹⁴,¹⁵, a key risk factor for stroke. These findings warrant further investigation into the causal mechanisms linking IL-6 to both ischemic stroke and bleeding, as well as its potential therapeutic implications.

Our study underscores the multifactorial nature of AF-related complications, with diverse pathophysiological pathways contributing to risk. Observational studies have shown that a multimarker approach significantly improves risk prediction in both cardiovascular disease and AF populations¹⁶⁻¹⁹. We demonstrated that incorporating key biomarkers into prediction models led to a modest but significant improvement in the discriminatory ability of Cox and most machine learning models. This supports the concept that a biomarker panel reflecting the diverse tissue changes seen in AF provides a valuable approach for comprehensive cardiovascular risk assessment.

Emerging candidate biomarkers may capture additional biological aspects of AF-related outcomes. For example, the thrombin–antithrombin complex has been associated with worse outcomes in anticoagulated Asian patients with AF²⁰, while factor VIII antigen independently predicts stroke risk²¹. Bone morphogenetic protein 10 (BMP10) has been associated with incident AF in a population free of AF at baseline²². Moreover, among patients with established AF, elevated BMP10 levels have been linked to a higher risk of ischemic stroke independent of oral anticoagulation treatment²³, and an increased incidence of adverse outcome events compared to those with lower levels²⁴. Future studies should investigate whether a more comprehensive proteomic analysis can provide deeper insights into the pathophysiology of AF complications and enhance risk stratification strategies.

Targeting multiple pathophysiological systems is essential for improving outcomes in complex cardiovascular conditions. For example, the polypill strategy has shown promise in stable coronary artery disease by simultaneously addressing multiple risk factors²⁵. In AF patients, targeted treatments aimed at underlying cardiovascular conditions have been shown to improve sinus rhythm maintenance in persistent AF patients²⁶; however, the impact of this strategy in reducing cardiovascular outcomes remains unclear. To date, no randomized trial has specifically evaluated the effects of such a multifaceted treatment approach in AF patients. Future clinical trials should therefore investigate whether comprehensive strategies targeting inflammation, coagulation disturbances, and cardiac dysfunction can improve long-term outcomes in this high-risk population.

The CHA₂DS₂-VASc score is a widely used tool for predicting stroke risk in AF patients²⁷. However, its discriminatory performance is moderate at best²⁸. Recent studies incorporating biomarkers, such as the ABC-stroke score, have demonstrated improved stroke prediction compared to the CHA₂DS₂-VASc score²⁹,³⁰. Our findings confirm the value of biomarkers in improving risk stratification in addition to clinical scores for both stroke and bleeding. The additive benefit of our biomarker panel was much larger when compared with biomarker-free scores (CHA₂DS₂-VASc, HAS-BLED) than when compared with scores that already include some biomarkers (ABC-stroke), confirming that a biomarker-based approach strongly enhances risk prediction compared to models based exclusively on clinical variables. Further studies are needed to determine the optimal number of biomarkers for such models. Our data suggest that a more comprehensive biomarker-based model provides better risk prediction. Amplified P-wave duration has been associated with AF recurrence after ablation and worse prognosis³¹,³². Whether a combination with biomarkers further improves risk prediction in AF patients should be assessed in further studies.

We built machine learning models to leverage the entire set of clinical and biomarker variables for risk prediction. While traditional Cox regression models are limited by the number of predictors they can include without overfitting the model, machine learning models offer the advantage of capturing complex, non-linear relationships between clinical variables, biomarkers, and outcomes. In our study, we demonstrated that machine learning models, such as XGBoost and LASSO, achieved modest but significant improvements in predictive performance when biomarkers were included. These models, coupled with biomarker panels, have the potential to help clinicians identify patients who may benefit from further investigation and treatment. Future studies should assess whether the use of machine learning-based risk models can improve the management of AF patients.

This study has several limitations. First, the models were developed using a Swiss cohort, and their performance in external validation with other AF patient cohorts remains to be determined. However, we performed repeated cross-validation on the machine learning models, supporting the robustness of the results. Second, the study population was predominantly anticoagulated, which may limit the generalizability of our results to non-anticoagulated populations. Third, the absence of a non-AF comparison group limits our ability to determine whether the observed biomarker changes are specific to AF or linked to other pathophysiological processes. Future studies should incorporate appropriate control groups without AF or leverage Mendelian randomization analyses to improve our understanding of the pathophysiology of AF. Fourth, while our panel of biomarkers was chosen based on a thorough literature review and strong evidence linking them to AF pathophysiology and its complications, we acknowledge that any selection process is somewhat arbitrary and may miss relevant biomarker associations. Lastly, 16% of patients were not on oral anticoagulation, which may have influenced the associations between biomarkers and some outcomes. However, we showed that the results were consistent when analyses were restricted to those on anticoagulation.

In summary, our study highlights the complex, multifactorial nature of AF-related cardiovascular complications. We identified several biomarkers linked to diverse pathophysiological pathways - including myocardial injury, inflammation, oxidative stress, coagulation, and cardiac dysfunction - that are associated with adverse cardiovascular outcomes. By integrating these biomarkers into both traditional and machine learning-based risk models, we enhanced predictive accuracy, underscoring the potential clinical utility of biomarker-informed risk assessments in refining and optimizing the management of patients with AF.

Methods

Study population and procedures

We included patients with previously diagnosed AF from 2 prospective, multicenter cohort studies in Switzerland that used similar methodologies. The Basel Atrial Fibrillation (BEAT-AF) study enrolled 1545 patients from 2010 to 2014 across 9 centers in Switzerland³³, and the Swiss Atrial Fibrillation (Swiss-AF) study enrolled 2415 patients from 2014 to 2017 across 14 centers in Switzerland³⁴. Both studies had almost identical inclusion and exclusion criteria, as shown in Supplementary Table 16. Eligible patients had to have previously diagnosed AF. Patients who had secondary forms of AF or were unable to provide informed consent were excluded. For this analysis, we combined the BEAT-AF and Swiss-AF datasets, excluding 67 patients because of missing follow-up information and 76 patients because of missing all biomarker values, leaving a total of 3817 patients (Supplementary Fig. 7). Both studies comply with the Declaration of Helsinki, the study protocols were approved by the local ethics committees (Ethikkommission Nordwest- und Zentralschweiz (EKNZ)), and written informed consent was obtained from all participants. This study was conducted and reported in general accordance with the STARD guidelines³⁵.

At study enrolment and during yearly follow-up visits, trained study personnel collected information about patient demographics, risk factors, medical history, and current medical therapy using standardized case report forms. Sex of study participants was determined based on self-report. A detailed list of all variables collected is outlined in Supplementary Table 17. AF type was categorized according to guideline recommendations at the time of protocol development into paroxysmal, persistent, or permanent³⁶. Body mass index was calculated as weight in kilograms divided by height in meters squared. Three consecutive blood pressure measurements were obtained at study enrolment, and the mean was used for all analyses. Estimated glomerular filtration rate (eGFR) was calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formula.
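The two derived baseline variables described above can be illustrated with a minimal Python sketch. The function names and the patient values are hypothetical and serve only as an illustration; the study's own data processing code is not shown in this article.

```python
from statistics import mean


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def mean_blood_pressure(readings_mmhg: list[float]) -> float:
    """Mean of the three consecutive readings taken at enrolment."""
    if len(readings_mmhg) != 3:
        raise ValueError("expected exactly three readings")
    return mean(readings_mmhg)


# Hypothetical patient values, for illustration only
bmi = body_mass_index(weight_kg=80.0, height_m=1.75)
sbp = mean_blood_pressure([138.0, 134.0, 130.0])
```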

Biomarker analyses and multiple imputation

Blood samples were drawn at baseline, immediately processed, and stored at −80 °C in a central biobank. We measured a panel of 12 biomarkers selected through an extensive literature review and robust evidence linking them to AF pathophysiology and its related complications. These biomarkers were chosen to capture distinct biological processes, including myocardial injury (hsTropT), inflammation (hs-CRP, IL-6, IGFBP-7, GDF-15), oxidative stress (GDF-15), renal disease (creatinine, cystatin C, OPN), coagulation (d-dimer), myocardial wall stress (NT-proBNP), extracellular matrix remodelling (IGFBP-7), liver disease (ALAT) and angiogenesis (ANG-2, IGFBP-7) (Supplementary Fig. 8). Biomarkers were analyzed centrally at Roche Diagnostics, Penzberg (Germany) on a cobas c311 or e601 by laboratory personnel blinded to clinical information under constant quality control and calibration. Most of the assays were routine products running on routine clinical analyzers. A detailed description of the biomarker measurements is provided in Supplementary Table 18.

To handle missing biomarker data in our dataset, we first assessed the percentage of missing values using a custom function pMiss, which calculates the proportion of missing values for each variable and each observation. The results showed that biomarkers had between 0.3% and 9.3% missing data. We then employed the mice package to visualize the missing data pattern and impute missing values using the predictive mean matching (PMM) method across 5 imputations. The imputed dataset was summarized and visually inspected using density plots and strip plots to assess the distribution and consistency of the imputed values (Supplementary Fig. 9). This approach ensures robust handling of missing data while preserving the integrity of the biomarker dataset for subsequent statistical analyses.
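As an illustration of the missingness check, the following Python sketch mimics what a helper like pMiss computes: the percentage of missing values per variable and per observation. The original helper was written in R and is not reproduced in the article, so the names `is_missing` and `p_miss` and the toy data below are assumptions. The sketch does not perform the imputation step.

```python
import math


def is_missing(value) -> bool:
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def p_miss(values) -> float:
    """Percentage of missing entries in a sequence of values."""
    values = list(values)
    if not values:
        return 0.0
    return 100.0 * sum(is_missing(v) for v in values) / len(values)


# Hypothetical toy data: one dict per patient, one key per biomarker
rows = [
    {"il6": 3.1, "gdf15": 1200.0, "ddimer": None},
    {"il6": None, "gdf15": 950.0, "ddimer": 0.4},
    {"il6": 2.2, "gdf15": 1810.0, "ddimer": 0.9},
    {"il6": 5.0, "gdf15": float("nan"), "ddimer": 0.6},
]
columns = ["il6", "gdf15", "ddimer"]

# Missingness per variable (column) and per observation (row)
by_variable = {c: p_miss(r[c] for r in rows) for c in columns}
by_patient = [p_miss(r[c] for c in columns) for r in rows]
```

Applying the same function along both axes makes it easy to spot variables or patients with unusually sparse data before choosing an imputation strategy.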

Adverse cardiovascular outcome measures

The three main outcomes of this analysis were: (1) a composite of cardiovascular death, nonfatal ischemic stroke, nonfatal systemic embolism, or nonfatal myocardial infarction, (2) heart failure hospitalization, and (3) major bleeding. Secondary outcomes were the individual components of the composite outcome, as well as total stroke, myocardial infarction (MI), clinically relevant non-major (NM) bleeding, a composite of major or clinically relevant NM bleeding, and all-cause death. Definitions of all outcomes were identical in both cohorts and are provided in Supplementary Table 19. All clinical outcomes were adjudicated by blinded study personnel and physicians.

Development of machine learning models

To capture non-linear and complex relationships between clinical variables, biomarkers, and outcomes, we applied 3 machine learning methods to develop predictive models for the primary and secondary outcomes: Least Absolute Shrinkage and Selection Operator (LASSO), random forest, and Extreme Gradient Boosting for survival analysis (XGBoost)³⁷⁻³⁹. For this, we included the whole sample of clinical variables (46 variables detailed in Supplementary Table 17) and constructed two types of models: a base model incorporating only clinical variables, and an enhanced model that included both clinical variables and all biomarkers (base model + biomarkers). The machine learning techniques used to build these models are outlined below.

Least absolute shrinkage and selection operator (LASSO)

LASSO was used to perform variable (feature) selection, regularization, and build parsimonious models for the primary and secondary outcomes. In brief, LASSO is a regularized regression analysis that simultaneously performs variable selection and regularization using machine learning algorithms. First, the feature matrix was prepared by converting the dataset into a model matrix excluding the intercept column. The response variable was set as the outcome being studied. LASSO logistic regression was then performed using the cv.glmnet function from the glmnet package, with 10-fold cross-validation to optimize the model’s regularization parameter (lambda). The optimal lambda was determined by the value that minimized the cross-validated error (lambda.min). The resulting model coefficients were extracted, and non-zero coefficients were identified as the selected variables.
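For reference, the objective minimized by an L1-penalized logistic regression can be written as follows. The notation is introduced here for illustration and is not taken from the article: n patients, p candidate variables, binary outcome y_i, predictor vector x_i, intercept β_0, and penalty weight λ.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

Larger values of λ shrink more coefficients exactly to zero, which is why the variables with non-zero coefficients at the cross-validated λ can be read as the selected variables.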

Random forest

Random forest, an ensemble learning method that enhances predictive accuracy by combining multiple decision trees, was used to develop models for predicting primary and secondary outcomes. Each variable was considered as a potential predictor, and the algorithm built an ensemble of decision trees, each trained on different subsets of variables. The final prediction was derived from the average of individual tree predictions. The random forest models were constructed using the randomForest package in R, with reproducibility ensured by setting a random seed. The models were trained with default settings. Predictive probabilities for the composite outcome were generated using the trained model, and model accuracy was evaluated using the confusionMatrix function. Additionally, model performance was assessed through 10-fold cross-validation across 100 decision trees.
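The 10-fold cross-validation mentioned above can be sketched in plain Python as follows. This only illustrates how patients are partitioned into training and test folds; it is not the authors' R code, and the function names and seed value are hypothetical.

```python
import random


def k_fold_indices(n: int, k: int = 10, seed: int = 42) -> list[list[int]]:
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n: int, k: int = 10, seed: int = 42):
    """Yield (train_indices, test_indices); each fold is the test set once."""
    folds = k_fold_indices(n, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Partition the cohort size reported in this study into 10 folds
splits = list(train_test_splits(n=3817, k=10))
```

In each of the 10 iterations, a model would be fitted on the training indices and evaluated on the held-out test indices, so that every patient contributes to exactly one test fold.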

Extreme gradient boosting (XGBoost)

The XGBoost algorithm builds a series of models for the outcome being studied, each focusing on correcting residuals (“boosting”) of the combined predictions of the models built so far. Specifically, each new model aims to capture the relationships in the data that were not well represented by the previous models. To develop predictive models using the XGBoost algorithm, we transformed outcome variables to a binary numeric format. The XGBoost model was trained on the dataset with a random seed set for reproducibility, using the xgboost package in R. The model was configured with a binary logistic objective and trained over 10 boosting rounds. Post-training, variable importance was assessed using the xgb.importance function, which ranked the features based on their contribution to the model. True outcomes were used to evaluate the model’s performance, allowing for a robust assessment of its predictive capabilities.
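The idea of fitting each new model to the residuals of the current ensemble can be illustrated with a deliberately simplified sketch. It uses squared-error boosting with one-split regression stumps on a single feature; it is not the XGBoost implementation, which uses a regularized objective with gradient and curvature information, and it does not use the binary logistic objective described above. All names and the toy data are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # drop max so the right side is non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x, y, rounds=10, learning_rate=0.5):
    """Add one stump per round, each fitted to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps, errors = [], []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * stump(xi)
                       for xi, pi in zip(x, predictions)]
        errors.append(sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y))

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    return predict, errors


# Hypothetical toy data: one biomarker level and a binary event indicator
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]
predict, errors = boost(x, y, rounds=10)
```

The list `errors` holds the training mean squared error after each round; it cannot increase from one round to the next in this setting, which is the "correcting residuals" behaviour in miniature. Training error alone says nothing about performance on unseen patients.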

Statistical analyses

The normality of distribution of each biomarker was assessed through visual inspection of histograms. Spearman rank correlations were applied to evaluate interrelationships between biomarkers. All biomarkers were log-transformed to improve the normality of the distribution. Cox proportional hazards models were constructed to assess hazard ratios (HR) and 95% confidence intervals (CI) for the main and secondary outcomes. To provide a unit-independent comparison between log-transformed biomarkers, HRs were standardized, representing the effect per 1 standard deviation (SD) increase. The initial model was adjusted for age and sex, and the multivariable model was adjusted for a prespecified list of covariates, including age, sex, body mass index, current smoker, history of hypertension, history of diabetes, prior stroke, history of heart failure, chronic kidney disease, and coronary artery disease. First, separate models were constructed for each biomarker. We then constructed a combined multivariable model including all biomarkers in a single model and performed a backward selection of the biomarkers using the Akaike information criterion (AIC) to exclude biomarkers. To assess the consistency of our findings, we conducted a sensitivity analysis for stroke and major bleeding, restricting the cohort to patients who were on oral anticoagulation therapy at baseline. In addition, we measured each variable’s relative importance in the models by calculating the partial χ² statistic minus the predictor degrees of freedom. We then calculated the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve with corresponding 95% CI for each Cox model with and without key biomarkers and compared them using DeLong’s test.
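The standardization of hazard ratios can be made concrete with a short sketch. If β is the Cox coefficient per unit of the log-transformed biomarker, the hazard ratio per 1 SD increase is exp(β × SD). The coefficient and biomarker values below are hypothetical, and the function name is an assumption rather than part of the study code.

```python
import math
from statistics import stdev


def standardized_hazard_ratio(beta_per_log_unit: float, log_values) -> float:
    """Hazard ratio per 1 SD increase of a log-transformed biomarker."""
    sd = stdev(log_values)  # sample standard deviation
    return math.exp(beta_per_log_unit * sd)


# Hypothetical raw biomarker concentrations and a hypothetical Cox coefficient
raw_values = [410.0, 980.0, 1500.0, 2300.0, 5200.0]
log_values = [math.log(v) for v in raw_values]
hr_per_sd = standardized_hazard_ratio(beta_per_log_unit=0.35, log_values=log_values)
```

Equivalently, the biomarker can be z-scored after log transformation before fitting the model, in which case the exponentiated coefficient is already the hazard ratio per 1 SD.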

We constructed two types of predictive machine learning models using the full set of clinical variables: a base model that included only clinical variables, and an enhanced model that incorporated both clinical variables and all biomarkers. Each model’s discriminative ability was evaluated using the AUC and corresponding 95% CI. The AUCs of the base model and the base model plus biomarkers were compared using DeLong’s test. Finally, we assessed the predictive performance of the Cox and machine learning models with biomarkers by comparing their AUCs to those of established clinical risk scores for stroke (ABC-stroke and CHA₂DS₂-VASc)^29,40 and for major bleeding (HAS-BLED)⁴¹. All statistical analyses were performed using R version 4.3.1. A P-value < 0.05 was considered statistically significant.
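
The comparison of a base model with an enhanced model can be sketched as follows. This is a plain-Python illustration of the AUC and of DeLong’s test for two correlated AUCs, not the R code used in the study. The predicted risks for the hypothetical "base" and "enhanced" models and the outcome labels are invented toy values, and the function names are assumptions.

```python
import math

def psi(pos_score, neg_score):
    # Pairwise kernel: 1 if the case is ranked above the non-case, 0.5 for ties.
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0

def placements(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    v10 = [sum(psi(p, n) for n in neg) / len(neg) for p in pos]
    v01 = [sum(psi(p, n) for p in pos) / len(pos) for n in neg]
    return v10, v01

def cov(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)

def delong_test(scores_a, scores_b, labels):
    """Return (AUC_a, AUC_b, z, two-sided p) for two models scored on the same patients."""
    v10_a, v01_a = placements(scores_a, labels)
    v10_b, v01_b = placements(scores_b, labels)
    auc_a = sum(v10_a) / len(v10_a)
    auc_b = sum(v10_b) / len(v10_b)
    # Variance of the AUC difference from the case and non-case components.
    variance = ((cov(v10_a, v10_a) + cov(v10_b, v10_b) - 2 * cov(v10_a, v10_b)) / len(v10_a)
                + (cov(v01_a, v01_a) + cov(v01_b, v01_b) - 2 * cov(v01_a, v01_b)) / len(v01_a))
    z = (auc_a - auc_b) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value

# Hypothetical outcomes (1 = event) and predicted risks from two models.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
base_scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.60, 0.70, 0.90]
enhanced_scores = [0.10, 0.30, 0.20, 0.60, 0.50, 0.70, 0.80, 0.90]

auc_base, auc_enhanced, z, p_value = delong_test(base_scores, enhanced_scores, labels)
print(f"AUC base = {auc_base:.4f}, AUC enhanced = {auc_enhanced:.4f}, z = {z:.2f}, p = {p_value:.3f}")
```

The test accounts for the fact that both AUCs are computed on the same patients, which is why the covariance terms between the two models enter the variance. The sketch fails with a division by zero if the two score vectors yield identical placements, a case a production implementation would need to handle.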

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Acknowledgements

This work was supported by grants from the Swiss National Science Foundation (grant numbers 33CS30_148474, 33CS30_177520, 32473B_176178, and 32003B_197524) awarded to D.C., M.K., and S.O.; the Swiss Heart Foundation, the Foundation for Cardiovascular Research Basel (FCVR), and the University of Basel awarded to D.C. The BEAT-AF study was supported by the Swiss National Science Foundation (grant number PP00P3_159322) awarded to D.C., the Swiss Heart Foundation, the University of Basel, Boehringer Ingelheim, Sanofi-Aventis, Merck Sharp & Dohme, Bayer, Daiichi Sankyo, and Pfizer/Bristol-Myers Squibb, all awarded to D.C., M.K., and S.O.

Author contributions

P.B.M. and D.C. conceived and planned the study. P.B.M., A.Z., and D.C. performed the technical parts and analytic approach. P.B.M. and D.C. analyzed the data and contributed to the original interpretation. P.B.M. and D.C. wrote the draft manuscript, and S.A., S.B., T.R., M.H., N.R., A.S.M., A.B., J.H.B., G.M., A.Z., B.W., E.R., G.C., P.K., L.H.B., S.O., and M.K. discussed the results and contributed to the final manuscript.

Peer review

Peer review information

Nature Communications thanks Hiromichi Wada and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Data availability

The data supporting the findings from this study are available in the article and its supplementary information. Deidentified individual-level data supporting the results of the study are not publicly available to protect patient privacy and to comply with the informed consent signed by the participants. Deidentified individual-level data supporting the results of the study may be made available from the corresponding author (Pascal B. Meyre) upon reasonable request and approval by the study steering committee. Written access proposals have to be submitted to the corresponding author at the following e-mail address: [email protected]. The expected timeframe for responses is 10 weeks. Source data are provided with this paper.

Code availability

The code is publicly available via GitHub–Zenodo and can be accessed at https://doi.org/10.5281/zenodo.15653551. The source code of the R packages used in this study is freely available online (https://cran.r-project.org/).

Competing interests

P.B.M. received funding from the Swiss National Science Foundation outside the submitted work. S.A. received funding from the Swiss Heart Foundation and speaker fees from Roche Diagnostics outside of the submitted work. S.B. received funding from the Swiss National Science Foundation, the Mach-Gaensslen Foundation, and the Bangerter-Rhyner Foundation outside the submitted work. T.R. reports research grants from the Swiss National Science Foundation, the Swiss Heart Foundation, and the sitem insel support fund, all for work outside the submitted study. T.R. also reports speaker/consulting honoraria or travel support from Abbott/SJM, AstraZeneca, Brahms, Bayer, Biosense Webster, Biotronik, Boston Scientific, Daiichi Sankyo, Medtronic, Pfizer BMS, and Roche, all for work outside the submitted study, as well as support for his institution’s fellowship program from Abbott/SJM, Biosense Webster, Biotronik, Boston Scientific, and Medtronic for work outside the submitted study. A.M. reports fellowship and training support from Biotronik, Boston Scientific, Medtronic, Abbott/St. Jude Medical, and Biosense Webster; speaker honoraria from Biosense Webster, Medtronic, Abbott/St. Jude Medical, AstraZeneca, Daiichi Sankyo, Biotronik, MicroPort, and Novartis; and consultant honoraria from Biosense Webster, Medtronic, Abbott/St. Jude Medical, and Biotronik. G.M. has received consultant fees for taking part in advisory boards from Novartis, Boehringer Ingelheim, Bayer, AstraZeneca, and Daiichi Sankyo, all outside of the current work. A.Z. is an employee of Roche Diagnostics, a commercial provider of diagnostic tests. M.K. reports personal fees from Bayer, Böhringer Ingelheim, Pfizer BMS, Daiichi Sankyo, Medtronic, Biotronik, Boston Scientific, and Johnson&Johnson, and grants from Bayer, Pfizer, Boston Scientific, BMS, and Biotronik. Grants from the Swiss National Science Foundation, the Swiss Heart Foundation, the Foundation for Cardiovascular Research Basel, and the University of Basel. D.C. has received consultant fees from Roche Diagnostics and Trimedics, outside of the current work. The remaining authors have nothing to disclose.

Supplementary information

The online version contains supplementary material available at https://doi.org/10.1038/s41467-025-62218-7.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Conen, D et al. Risk of death and cardiovascular events in initially healthy women with new-onset atrial fibrillation. JAMA; 2011; 305, pp. 2080-2087.

2. Soliman, EZ et al. Atrial fibrillation and the risk of myocardial infarction. JAMA Intern. Med.; 2014; 174, pp. 107-114.

3. Wolf, PA; Abbott, RD; Kannel, WB. Atrial fibrillation as an independent risk factor for stroke: the Framingham Study. Stroke; 1991; 22, pp. 983-988.

4. Wang, TJ et al. Temporal relations of atrial fibrillation and congestive heart failure and their joint influence on mortality: the Framingham Heart Study. Circulation; 2003; 107, pp. 2920-2925.

5. Meyre, PB et al. Bleeding and ischaemic events after first bleed in anticoagulated atrial fibrillation patients: risk and timing. Eur. Heart J.; 2022; 43, pp. 4899-4908.

6. Takahashi, Y et al. Histological validation of atrial structural remodelling in patients with atrial fibrillation. Eur. Heart J.; 2023; 44, pp. 3339-3353.

7. Rochette, L., Dogon, G., Zeller, M., Cottin, Y. & Vergely, C. GDF15 and cardiac cells: current concepts and new insights. Int. J. Mol. Sci. https://doi.org/10.3390/ijms22168889 (2021).

8. Hagström, E et al. Growth differentiation factor-15 level predicts major bleeding and cardiovascular events in patients with acute coronary syndromes: results from the PLATO study. Eur. Heart J.; 2016; 37, pp. 1325-1333.

9. Mathews, L et al. Growth differentiation factor 15 and risk of bleeding events: The Atherosclerosis Risk in Communities Study. J. Am. Heart Assoc.; 2023; 12, e023847.

10. Hijazi, Z et al. The novel biomarker-based ABC (age, biomarkers, clinical history)-bleeding risk score for patients with atrial fibrillation: a derivation and validation study. Lancet; 2016; 387, pp. 2302-2311.

11. Blum, S et al. Insulin-like growth factor-binding protein 7 and risk of congestive heart failure hospitalization in patients with atrial fibrillation. Heart Rhythm; 2021; 18, pp. 512-519.

12. Aulin, J et al. Serial measurement of interleukin-6 and risk of mortality in anticoagulated patients with atrial fibrillation: Insights from ARISTOTLE and RE-LY trials. J. Thromb. Haemost.; 2020; 18, pp. 2287-2295.

13. Georgakis, MK et al. Interleukin-6 signaling effects on ischemic stroke and other cardiovascular outcomes: a mendelian randomization study. Circ. Genom. Precis. Med.; 2020; 13, e002872.

14. Levin, MG et al. A missense variant in the IL-6 receptor and protection from peripheral artery disease. Circ. Res.; 2021; 129, pp. 968-970.

15. Swerdlow, DI et al. The interleukin-6 receptor as a target for prevention of coronary heart disease: a mendelian randomisation analysis. Lancet; 2012; 379, pp. 1214-1224.

16. Sabatine, MS et al. Evaluation of multiple biomarkers of cardiovascular stress for risk prediction and guiding medical therapy in patients with stable coronary disease. Circulation; 2012; 125, pp. 233-240.

17. Pol, T et al. Using multimarker screening to identify biomarkers associated with cardiovascular death in patients with atrial fibrillation. Cardiovasc. Res.; 2022; 118, pp. 2112-2123.

18. Conen, D et al. A multimarker approach to assess the influence of inflammation on the incidence of atrial fibrillation in women. Eur. Heart J.; 2010; 31, pp. 1730-1736.

19. Zethelius, B et al. Use of multiple biomarkers to improve the prediction of death from cardiovascular causes. N. Engl. J. Med.; 2008; 358, pp. 2107-2116.

20. Koretsune, Y et al. Coagulation biomarkers and clinical outcomes in elderly patients with nonvalvular atrial fibrillation: ANAFIE Subcohort Study. JACC Asia; 2023; 3, pp. 595-607.

21. Singleton, MJ et al. Multiple blood biomarkers and stroke risk in atrial fibrillation: the REGARDS study. J. Am. Heart Assoc.; 2021; 10, e020157.

22. Chua, W et al. An angiopoietin 2, FGF23, and BMP10 biomarker signature differentiates atrial fibrillation from other concomitant cardiovascular conditions. Sci. Rep.; 2023; 13, 16743.

23. Hijazi, Z et al. Bone morphogenetic protein 10: a novel risk marker of ischaemic stroke in patients with atrial fibrillation. Eur. Heart J.; 2023; 44, pp. 208-218.

24. Hennings, E et al. Bone morphogenetic protein 10-a novel biomarker to predict adverse outcomes in patients with atrial fibrillation. J. Am. Heart Assoc.; 2023; 12, e028255.

25. Yusuf, S et al. Polypill with or without aspirin in persons without cardiovascular disease. N. Engl. J. Med.; 2021; 384, pp. 216-228.

26. Rienstra, M et al. Targeted therapy of underlying conditions improves sinus rhythm maintenance in patients with persistent atrial fibrillation: results of the RACE 3 trial. Eur. Heart J.; 2018; 39, pp. 2987-2996.

27. Camm, AJ et al. 2012 focused update of the ESC Guidelines for the management of atrial fibrillation: an update of the 2010 ESC Guidelines for the management of atrial fibrillation. Developed with the special contribution of the European Heart Rhythm Association. Eur. Heart J.; 2012; 33, pp. 2719-2747.

28. Siddiqi, TJ et al. Utility of the CHA2DS2-VASc score for predicting ischaemic stroke in patients with or without atrial fibrillation: a systematic review and meta-analysis. Eur. J. Prev. Cardiol.; 2022; 29, pp. 625-631.

29. Hijazi, Z et al. The ABC (age, biomarkers, clinical history) stroke risk score: a biomarker-based risk score for predicting stroke in atrial fibrillation. Eur. Heart J.; 2016; 37, pp. 1582-1590.

30. Berg, DD et al. Performance of the ABC scores for assessing the risk of stroke or systemic embolism and bleeding in patients with atrial fibrillation in ENGAGE AF-TIMI 48. Circulation; 2019; 139, pp. 760-771.

31. Jadidi, A et al. The duration of the amplified sinus-P-wave identifies presence of left atrial low voltage substrate and predicts outcome after pulmonary vein isolation in patients with persistent atrial fibrillation. JACC Clin. Electrophysiol.; 2018; 4, pp. 531-543.

32. Magnani, JW et al. P wave duration is associated with cardiovascular and all-cause mortality outcomes: the National Health and Nutrition Examination Survey. Heart Rhythm; 2011; 8, pp. 93-100.

33. Blum, S. et al. Prospective assessment of sex-related differences in symptom status and health perception among patients with atrial fibrillation. J. Am. Heart Assoc. https://doi.org/10.1161/jaha.116.005401 (2017).

34. Conen, D et al. Design of the Swiss Atrial Fibrillation Cohort Study (Swiss-AF): structural brain damage and cognitive decline among patients with atrial fibrillation. Swiss Med Wkly; 2017; 147, w14467.

35. Bossuyt, PM et al. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ; 2015; 351, h5527.

36. Camm, AJ et al. Guidelines for the management of atrial fibrillation: the Task Force for the Management of Atrial Fibrillation of the European Society of Cardiology (ESC). Eur. Heart J.; 2010; 31, pp. 2369-2429.

37. Breiman, L. Random forests. Mach. Learn.; 2001; 45, pp. 5-32.

38. Friedman, J; Hastie, T; Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw.; 2010; 33, pp. 1-22.

39. Chen, T. & Guestrin, C. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, San Francisco, CA, USA, 2016).

40. Lip, GY; Nieuwlaat, R; Pisters, R; Lane, DA; Crijns, HJ. Refining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the Euro Heart Survey on atrial fibrillation. Chest; 2010; 137, pp. 263-272.

41. Pisters, R et al. A novel user-friendly score (HAS-BLED) to assess 1-year risk of major bleeding in patients with atrial fibrillation: the Euro Heart Survey. Chest; 2010; 138, pp. 1093-1100.

© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Atrial fibrillation (AF) increases the risk of adverse cardiovascular events, yet the underlying biological mechanisms remain unclear. We evaluate a panel of 12 circulating biomarkers representing diverse pathophysiological pathways in 3817 AF patients to assess their association with adverse cardiovascular outcomes. We identify 5 biomarkers, namely D-dimer, growth differentiation factor 15 (GDF-15), interleukin-6 (IL-6), N-terminal pro-B-type natriuretic peptide (NT-proBNP), and high-sensitivity troponin T (hsTropT), that independently predict cardiovascular death, stroke, myocardial infarction, and systemic embolism, significantly enhancing predictive accuracy. Additionally, GDF-15, insulin-like growth factor-binding protein-7 (IGFBP-7), NT-proBNP, and hsTropT predict heart failure hospitalization, while GDF-15 and IL-6 are associated with major bleeding events. A biomarker model improves predictive accuracy for stroke and major bleeding compared to established clinical risk scores. Machine learning models incorporating these biomarkers demonstrate consistent improvements in risk stratification across most outcomes. In this work, we show that integrating biomarkers related to myocardial injury, inflammation, oxidative stress, and coagulation into both conventional and machine learning-based models refines prognosis and guides clinical decision-making in AF patients.

In atrial fibrillation patients, circulating biomarkers reflecting inflammation, myocardial injury, and coagulation improve prediction of stroke, heart failure, and bleeding, outperforming clinical scores and enhancing risk stratification.

Details

Title

Biomarker panels for improved risk prediction and enhanced biological insights in patients with atrial fibrillation

Author

Meyre, Pascal B.¹; Aeschbacher, Stefanie¹; Blum, Steffen¹; Reichlin, Tobias²; Haller, Moa³; Rodondi, Nicolas³; Müller, Andreas S.⁴; Bernheim, Alain⁴; Beer, Jürg Hans⁵; Moschovitis, Giorgio⁶; Ziegler, André⁷; Wahrenberger, Bianca¹; Rigamonti, Elia⁶; Conte, Giulio⁸; Krisai, Philipp¹; Bonati, Leo H.⁹; Osswald, Stefan¹; Kühne, Michael¹; Conen, David¹⁰

¹ Department of Cardiology, University Heart Center, University Hospital Basel, Basel, Switzerland (ROR: https://ror.org/04k51q396) (GRID: grid.410567.1) (ISNI: 0000 0001 1882 505X); Cardiovascular Research Institute Basel, University Hospital Basel, Basel, Switzerland (ROR: https://ror.org/02s6k3f65) (GRID: grid.6612.3) (ISNI: 0000 0004 1937 0642)
² Department of Cardiology, Inselspital, Bern University Hospital, Bern, Switzerland (ROR: https://ror.org/01q9sj412) (GRID: grid.411656.1) (ISNI: 0000 0004 0479 0855)
³ Institute of Primary Health Care (BIHAM), University of Bern, Bern, Switzerland (ROR: https://ror.org/02k7v4d05) (GRID: grid.5734.5) (ISNI: 0000 0001 0726 5157); Department of General Internal Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland (ROR: https://ror.org/02k7v4d05) (GRID: grid.5734.5) (ISNI: 0000 0001 0726 5157)
⁴ Department of Cardiology, Triemli Hospital Zurich, Zurich, Switzerland (ROR: https://ror.org/03kpdys72) (GRID: grid.414526.0) (ISNI: 0000 0004 0518 665X)
⁵ Department Internal Medicine, Baden Switzerland and Center of Molecular Cardiology, Cantonal Hospital Baden, University of Zürich, Zürich, Switzerland (ROR: https://ror.org/02crff812) (GRID: grid.7400.3) (ISNI: 0000 0004 1937 0650)
⁶ Division of Cardiology, Regional Hospital of Lugano, Ente Ospedaliero Cantonale (EOC), Lugano, Switzerland (ROR: https://ror.org/00sh19a92) (GRID: grid.469433.f) (ISNI: 0000 0004 0514 7845); Cardiocentro Ticino Institute, Ente Ospedaliero Cantonale (EOC), Lugano, Switzerland (ROR: https://ror.org/00sh19a92) (GRID: grid.469433.f) (ISNI: 0000 0004 0514 7845)
⁷ Roche Diagnostics International, Rotkreuz, Switzerland (ROR: https://ror.org/00by1q217) (GRID: grid.417570.0) (ISNI: 0000 0004 0374 1269)
⁸ Cardiocentro Ticino Institute, Ente Ospedaliero Cantonale (EOC), Lugano, Switzerland (ROR: https://ror.org/00sh19a92) (GRID: grid.469433.f) (ISNI: 0000 0004 0514 7845)
⁹ Rheinfelden Rehabilitation Clinic, Rheinfelden, Switzerland
¹⁰ Population Health Research Institute, McMaster University, Hamilton, ON, Canada (ROR: https://ror.org/02fa3aq29) (GRID: grid.25073.33) (ISNI: 0000 0004 1936 8227); Department of Medicine, McMaster University, Hamilton, ON, Canada (ROR: https://ror.org/02fa3aq29) (GRID: grid.25073.33) (ISNI: 0000 0004 1936 8227); Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada (ROR: https://ror.org/02fa3aq29) (GRID: grid.25073.33) (ISNI: 0000 0004 1936 8227)

Pages

7042

Section

Article

Publication year

2025

Publication date

2025

Publisher

Nature Publishing Group

e-ISSN

20411723

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1038/s41467-025-62218-7

ProQuest document ID

3235184796
