Abstract

Alzheimer's disease (AD) is a neurodegenerative disease whose molecular mechanisms are activated several years before cognitive symptoms appear. Genotype-based prediction of the phenotype is thus a key challenge for the early diagnosis of AD. Machine learning techniques that have been proposed to address this challenge do not consider known biological interactions between the genes used as input features, thus neglecting important information about the disease mechanisms at play. To mitigate this, we first extracted AD subnetworks from several protein–protein interaction (PPI) databases and labeled these with genotype information (number of missense variants) to make them patient-specific. Next, we trained Graph Neural Networks (GNNs) on the patient-specific networks for phenotype prediction. We tested different PPI databases and compared the performance of the GNN models to baseline models using classical machine learning techniques, as well as to models trained on randomized networks and input datasets. The overall results showed that GNNs could not outperform a baseline predictor using only the APOE gene, suggesting that missense variants are not sufficient to explain disease risk beyond the APOE status. Nevertheless, our results showed that GNNs outperformed other machine learning techniques and that networks built from protein–protein interactions led to better results than randomized networks. These findings highlight that gene interactions are a valuable source of information in predicting disease status.
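The idea of a patient-specific network can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names other than APOE, the edge list, and the variant counts are invented for illustration, and the single averaging step stands in for a learned GNN layer, which would also apply trainable weights and a non-linearity.

```python
# Illustrative sketch only. GENE_A, GENE_B, GENE_C, GENE_X, the edges and the
# variant counts are hypothetical placeholders, not data from the study.
from typing import Dict, List, Set, Tuple


def build_adjacency(edges: List[Tuple[str, str]], genes: List[str]) -> Dict[str, Set[str]]:
    """Keep only PPI edges whose two endpoints both belong to the disease gene set."""
    adj: Dict[str, Set[str]] = {g: set() for g in genes}
    for a, b in edges:
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def label_patient(genes: List[str], missense_counts: Dict[str, int]) -> Dict[str, float]:
    """Node feature = number of missense variants the patient carries in that gene."""
    return {g: float(missense_counts.get(g, 0)) for g in genes}


def mean_aggregate(adj: Dict[str, Set[str]], features: Dict[str, float]) -> Dict[str, float]:
    """One round of neighbourhood averaging (node itself included), without learned weights."""
    out: Dict[str, float] = {}
    for gene, neighbours in adj.items():
        group = [gene] + sorted(neighbours)
        out[gene] = sum(features[n] for n in group) / len(group)
    return out


genes = ["APOE", "GENE_A", "GENE_B", "GENE_C"]
# The last edge touches a gene outside the subnetwork and is therefore dropped.
edges = [("APOE", "GENE_A"), ("GENE_A", "GENE_B"), ("GENE_B", "GENE_X")]

adj = build_adjacency(edges, genes)
patient = label_patient(genes, {"APOE": 1, "GENE_B": 2})
smoothed = mean_aggregate(adj, patient)
print(smoothed)
```

The network topology is shared across patients while the node features differ per patient, so the same adjacency can be reused with a new call to `label_patient` for each individual.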

Details

Title
On the limits of graph neural networks for the early diagnosis of Alzheimer’s disease
Author
Hernández-Lorenzo, Laura 1; Hoffmann, Markus 2; Scheibling, Evelyn 3; List, Markus 3; Matías-Guiu, Jordi A. 4; Ayala, Jose L. 5

1 Complutense University of Madrid, Department of Computer Architecture and Automation, Computer Science Faculty, Madrid, Spain (GRID:grid.4795.f) (ISNI:0000 0001 2157 7667); San Carlos Research Health Institute (IdISSC), Universidad Complutense, Department of Neurology, Hospital Clínico San Carlos, Madrid, Spain (GRID:grid.4795.f) (ISNI:0000 0001 2157 7667); Technical University of Munich, Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Munich, Germany (GRID:grid.6936.a) (ISNI:0000 0001 2322 2966)
2 Technical University of Munich, Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Munich, Germany (GRID:grid.6936.a) (ISNI:0000 0001 2322 2966); Technical University of Munich, Institute for Advanced Study, Garching, Germany (GRID:grid.6936.a) (ISNI:0000 0001 2322 2966)
3 Technical University of Munich, Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Munich, Germany (GRID:grid.6936.a) (ISNI:0000 0001 2322 2966)
4 San Carlos Research Health Institute (IdISSC), Universidad Complutense, Department of Neurology, Hospital Clínico San Carlos, Madrid, Spain (GRID:grid.4795.f) (ISNI:0000 0001 2157 7667)
5 Complutense University of Madrid, Department of Computer Architecture and Automation, Computer Science Faculty, Madrid, Spain (GRID:grid.4795.f) (ISNI:0000 0001 2157 7667)
Publication year
2022
Publication date
2022
Publisher
Nature Publishing Group
e-ISSN
20452322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2727118345
Copyright
© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.