Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm

Abstract

Background

Intracranial aneurysms (IAs) are dangerous because of their potential to rupture. We previously found significant RNA expression differences in circulating neutrophils between patients with and without unruptured IAs, and we trained machine learning models to predict the presence of IA using 40 neutrophil transcriptomes. Here, we aim to develop a predictive model for unruptured IA using neutrophil transcriptomes from a larger population and more robust machine learning methods.

Methods

Neutrophil RNA extracted from the blood of 134 patients (55 with IA, 79 IA-free controls) was subjected to next-generation RNA sequencing. In a randomly selected training cohort (n = 94), the Least Absolute Shrinkage and Selection Operator (LASSO) selected transcripts, from which we constructed prediction models using four well-established supervised machine-learning algorithms (K-Nearest Neighbors, Random Forest, and Support Vector Machines with Gaussian and cubic kernels). We tested the models in the remaining samples (n = 40) and assessed model performance using receiver operating characteristic (ROC) curves. Real-time quantitative polymerase chain reaction (RT-qPCR) of 9 IA-associated genes was used to verify gene expression in a subset of 49 neutrophil RNA samples. We also examined the potential influence of demographics and comorbidities on model prediction.
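
The train/test design described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the data are synthetic, the LASSO selection step is omitted, only a plain K-Nearest Neighbors scorer stands in for the four algorithms, and every name (`split_cohort`, `knn_score`, `roc_auc`), the seeds, the five made-up features, and k = 5 are assumptions chosen for illustration. Only the cohort sizes (134 samples, 55 IA and 79 controls, split 94/40) come from the text.

```python
import random

def split_cohort(n_samples, n_train, seed=0):
    """Randomly split sample indices into training and testing cohorts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]

def knn_score(train_x, train_y, x, k=5):
    """Fraction of the k nearest training samples (squared Euclidean) labelled 1."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_x, train_y)
    )
    return sum(y for _, y in dists[:k]) / k

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (rank-based definition)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in data (assumption): 55 "IA" samples (label 1) and
# 79 "controls" (label 0), each with 5 made-up features. IA samples are
# shifted by +1 on average so the classes are separable.
rng = random.Random(42)
labels = [1] * 55 + [0] * 79
features = [[rng.gauss(float(y), 1.0) for _ in range(5)] for y in labels]

# 94 training samples, 40 held-out testing samples, as in the text.
train_idx, test_idx = split_cohort(len(labels), n_train=94, seed=1)
train_x = [features[i] for i in train_idx]
train_y = [labels[i] for i in train_idx]
test_y = [labels[i] for i in test_idx]

# Score each held-out sample using the training cohort only.
test_scores = [knn_score(train_x, train_y, features[i], k=5) for i in test_idx]

auc = roc_auc(test_y, test_scores)
accuracy = sum((s >= 0.5) == bool(y) for s, y in zip(test_scores, test_y)) / len(test_y)
print(f"test n={len(test_idx)}, AUC={auc:.2f}, accuracy={accuracy:.2f}")
```

The key design point the sketch preserves is that the held-out samples never influence model fitting: scores for the 40 testing samples are computed from the 94 training samples alone, and AUC and accuracy are evaluated only on the held-out cohort.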

Results

Feature selection using LASSO in the training cohort identified 37 IA-associated transcripts. Models trained using these transcripts had a maximum accuracy of 90% in the testing cohort. Testing performance across all methods had an average area under the ROC curve (AUC) of 0.97, an improvement over our previous models. The Random Forest model performed best across both training and testing cohorts. RT-qPCR confirmed expression differences in 7 of 9 genes tested. Gene ontology and Ingenuity Pathway Analysis (IPA) network analyses performed on the 37 model genes reflected dysregulated inflammation, cell signaling, and apoptosis processes. In our data, demographics and comorbidities did not affect model performance.

Conclusions

We improved upon our previous IA prediction models based on circulating neutrophil transcriptomes by increasing sample size and by implementing LASSO and more robust machine learning methods. Future studies are needed to validate these models in larger cohorts and to further investigate the effect of covariates.

Details

Title

Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm

Author

Poppenberg, Kerry E; Tutino, Vincent M; Lu, Li; Waqas, Muhammad; June, Armond; Chaves, Lee; Jiang, Kaiyu; Jarvis, James N; Sun, Yijun; Snyder, Kenneth V; Levy, Elad I; Siddiqui, Adnan H; Kolega, John; Meng, Hui

Pages

1-19

Section

Research

Publication year

2020

Publication date

2020

Publisher

Springer Nature B.V.

e-ISSN

1479-5876

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1186/s12967-020-02550-2

ProQuest document ID

2451929162

© 2020. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
