
Abstract

Introduction

The interpretation of plain hip radiographs can vary widely among physicians. This study aimed to develop and validate a deep learning-based screening model for distinguishing normal hips from severe hip diseases on plain radiographs.

Methods

Electronic medical records and plain radiographs from 2004 to 2012 were used to construct two patient groups: the hip disease group (patients who underwent total hip arthroplasty) and the normal group. A total of 1,726 radiographs (500 normal hip radiographs and 1,226 radiographs with hip disease) were included and allocated to training (320 and 783), validation (80 and 196), and test (100 and 247) sets. Four models were designed: (1) raw images for both the training and test sets; (2) preprocessed images for training but raw images for the test set; (3) preprocessed images for both sets; and (4) the backbone algorithm changed from DenseNet to EfficientNet. The deep learning models were compared in terms of accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1-score, and area under the receiver operating characteristic curve (AUROC).
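
For illustration, the sketch below shows how such a two-class classifier (normal vs. hip disease) can be built on a DenseNet or EfficientNet backbone and how the reported screening metrics can be computed. This is a minimal sketch using PyTorch/torchvision and scikit-learn; the specific backbone variants (densenet121, efficientnet_b0), the use of ImageNet-pretrained weights, and the 0.5 decision threshold are assumptions for the example and are not taken from the paper.

    import numpy as np
    import torch.nn as nn
    from torchvision import models
    from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

    def build_classifier(backbone: str = "efficientnet") -> nn.Module:
        # Two-class head (normal vs. hip disease) on an ImageNet-pretrained backbone.
        # The exact variants (densenet121 / efficientnet_b0) are assumptions.
        if backbone == "densenet":
            net = models.densenet121(weights="IMAGENET1K_V1")
            net.classifier = nn.Linear(net.classifier.in_features, 2)
        else:
            net = models.efficientnet_b0(weights="IMAGENET1K_V1")
            net.classifier[1] = nn.Linear(net.classifier[1].in_features, 2)
        return net

    def screening_metrics(y_true, y_prob, threshold: float = 0.5) -> dict:
        # Metrics reported in the study: accuracy, sensitivity, specificity, PPV, NPV,
        # F1-score, and AUROC; the 0.5 operating threshold is an assumption.
        y_pred = (np.asarray(y_prob) >= threshold).astype(int)
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        return {
            "accuracy": (tp + tn) / (tp + tn + fp + fn),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
            "f1": f1_score(y_true, y_pred),
            "auroc": roc_auc_score(y_true, y_prob),
        }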

Results

The mean age of the patients was 54.0 ± 14.8 years in the hip disease group and 49.8 ± 14.9 years in the normal group. The final model showed the highest performance in both the internal test set (accuracy 0.96, sensitivity 0.96, specificity 0.97, PPV 0.99, NPV 0.99, F1-score 0.97, and AUROC 0.99) and the external validation set (accuracy 0.94, sensitivity 0.93, specificity 0.96, PPV 0.95, NPV 0.93, F1-score 0.94, and AUROC 0.98). In the Grad-CAM images, the first model relied on unrelated marks on the radiograph, whereas the second and third models focused mainly on the femoral shaft and the sciatic notch, respectively.
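
For reference, Grad-CAM maps of the kind discussed above can be produced with forward and backward hooks on a convolutional layer. The snippet below is a minimal hook-based sketch; the choice of target layer and the target class index are assumptions, and the study's own visualization pipeline may differ.

    import torch
    import torch.nn.functional as F

    def grad_cam(model, image, target_layer, class_idx: int = 1):
        # image: tensor of shape (C, H, W); class_idx = 1 is assumed to mean "hip disease".
        feats, grads = {}, {}
        fh = target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
        bh = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))
        try:
            logits = model(image.unsqueeze(0))
            model.zero_grad()
            logits[0, class_idx].backward()
            weights = grads["a"].mean(dim=(2, 3), keepdim=True)      # global-average-pooled gradients
            cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
            cam = F.interpolate(cam, size=image.shape[1:], mode="bilinear", align_corners=False)
            cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
        finally:
            fh.remove()
            bh.remove()
        return cam[0, 0]  # heat map with the same spatial size as the input radiograph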

Conclusion

The deep learning-based model showed high accuracy and reliability in screening for hip diseases on plain radiographs, potentially aiding physicians in diagnosing hip conditions more accurately.

Details

Title
Deep learning based screening model for hip diseases on plain radiographs
Author
Park, Jung-Wee; Ryu, Seung Min; Kim, Hong-Seok; Lee, Young-Kyun; Yoo, Jeong Joon
First page
e0318022
Section
Research Article
Publication year
2025
Publication date
Feb 2025
Publisher
Public Library of Science
e-ISSN
1932-6203
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3166679444
Copyright
© 2025 Park et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.