Abstract

Objectives

High classification accuracy of Alzheimer’s disease (AD) from structural MRI has been achieved using deep neural networks, yet the specific image features contributing to these decisions remain unclear. In this study, the contributions of T1-weighted (T1w) gray-white matter texture, volumetric information, and preprocessing—particularly skull-stripping—were systematically assessed.

Materials and methods

A dataset of 990 matched T1w MRIs from AD patients and cognitively normal controls from the ADNI database was used. Preprocessing was varied through skull-stripping and intensity binarization to isolate texture and shape contributions. A 3D convolutional neural network was trained on each configuration, and classification performance was compared using exact McNemar tests with discrete Bonferroni-Holm correction. Feature relevance was analyzed using Layer-wise Relevance Propagation, image similarity metrics, and spectral clustering of relevance maps.
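
As an illustration of the statistical comparison named above, the following is a minimal sketch of a two-sided exact McNemar test followed by a standard Holm step-down correction. It is not the authors' code: the function names and the discordant-pair counts are hypothetical, and the plain Holm procedure is used here as a stand-in for the discrete Bonferroni-Holm variant mentioned in the text, which additionally exploits the discreteness of the test statistic.

```python
from math import comb


def exact_mcnemar(b, c):
    """Two-sided exact McNemar p-value.

    b: cases model A classified correctly and model B incorrectly.
    c: cases model B classified correctly and model A incorrectly.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # Probability of observing k or fewer in the smaller discordant cell.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def holm_adjust(pvalues):
    """Standard Holm step-down adjustment (assumption: not the discrete variant)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        value = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, value)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical discordant counts for three pairwise model comparisons.
    comparisons = [(1, 9), (5, 5), (4, 7)]
    raw = [exact_mcnemar(b, c) for b, c in comparisons]
    for pair, p_raw, p_adj in zip(comparisons, raw, holm_adjust(raw)):
        print(pair, round(p_raw, 4), round(p_adj, 4))
```

Only the discordant pairs enter the test, so two models with identical accuracy can still differ significantly if they err on different subjects, and vice versa.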

Results

Despite substantial differences in image content, classification accuracy, sensitivity, and specificity remained stable across preprocessing conditions. Models trained on binarized images preserved performance, indicating minimal reliance on gray-white matter texture. Instead, volumetric features—particularly brain contours introduced through skull-stripping—were consistently used by the models.

Conclusion

This behavior reflects a shortcut learning phenomenon, where preprocessing artifacts act as potentially unintended cues. The resulting Clever Hans effect emphasizes the critical importance of interpretability tools to reveal hidden biases and to ensure robust and trustworthy deep learning in medical imaging.

Critical relevance statement

We investigated the mechanisms underlying deep learning-based disease classification using a widely utilized Alzheimer’s disease dataset, and our findings reveal a reliance on features induced through skull-stripping, highlighting the need for careful preprocessing to ensure clinically relevant and interpretable models.

Key Points

Shortcut learning is induced by skull-stripping applied to T1-weighted MRIs.

Explainable deep learning and spectral clustering estimate the bias.

The findings highlight the importance of understanding the dataset, image preprocessing, and deep learning model for interpretation and validation.

Details

1009240
Title
Skull-stripping induces shortcut learning in MRI-based Alzheimer’s disease classification
Author
Tinauer, Christian 1 ; Sackl, Maximilian 1 ; Stollberger, Rudolf 2 ; Schmidt, Reinhold 1 ; Ropele, Stefan 3 ; Langkammer, Christian 3 

1 Medical University of Graz, Department of Neurology, Graz, Austria (GRID:grid.11598.34) (ISNI:0000 0000 8988 2476)
2 Graz University of Technology, Institute of Biomedical Imaging, Graz, Austria (GRID:grid.410413.3) (ISNI:0000 0001 2294 748X); BioTechMed-Graz, Graz, Austria (GRID:grid.452216.6)
3 Medical University of Graz, Department of Neurology, Graz, Austria (GRID:grid.11598.34) (ISNI:0000 0000 8988 2476); BioTechMed-Graz, Graz, Austria (GRID:grid.452216.6)
Publication title
Volume
16
Issue
1
Pages
283
Publication year
2025
Publication date
Dec 2025
Publisher
Springer Nature B.V.
Place of publication
Heidelberg
Country of publication
Netherlands
e-ISSN
18694101
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-12-22
Milestone dates
2025-11-11 (Registration); 2025-07-29 (Received); 2025-11-08 (Accepted)
First posting date
22 Dec 2025
ProQuest document ID
3285879492
Document URL
https://www.proquest.com/scholarly-journals/skull-stripping-induces-shortcut-learning-mri/docview/3285879492/se-2?accountid=208611
Copyright
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-12-26
Database
ProQuest One Academic