© 2024 Tsunoda et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Although the billing database of the universal healthcare system in Japan potentially includes large-cohort data on patients with immunoglobulin A nephropathy, diagnosis codes assigned for billing purposes should not be used directly for clinical research because of the risk of misdiagnosis. To solve this problem, we aimed to develop a novel method for identifying patients with immunoglobulin A nephropathy from billing data using machine learning. The medical records and bills of 3,743 patients who consulted nephrologists at a single center were extracted. Patients were labeled as having been diagnosed with immunoglobulin A nephropathy on the basis of a review of their medical records. Both a manual analysis of diagnostic accuracy and a machine learning analysis were performed. For machine learning, the datasets were preprocessed in three patterns and input to XGBoost with five-fold cross-validation. Of all the participants, 437 were labeled as having been diagnosed with immunoglobulin A nephropathy. Billing codes for immunoglobulin A nephropathy had been assigned to approximately half of them. The manually created criteria, which consisted of the examinations and treatments recommended in the Japanese guidelines for immunoglobulin A nephropathy, showed both specificity and sensitivity < 0.8. In contrast, in the receiver operating characteristic curve analysis, machine learning yielded area under the curve values over 0.9 when the data were preprocessed from a clinical viewpoint. Applying machine learning to a dataset preprocessed from a clinical viewpoint thus achieved high performance in detecting patients with immunoglobulin A nephropathy. This methodology contributes to the construction of disease-specific cohorts from large billing datasets.
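The abstract reports two kinds of metric: sensitivity and specificity for the manually created criteria, and the area under the receiver operating characteristic curve (AUC) for the machine learning model. The following is a minimal sketch of how these quantities are commonly computed, written in plain Python. It is not the authors' code, the function names are hypothetical, and it does not reproduce the XGBoost model or the study data. The AUC here is computed through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Hypothetical helper for illustration; not taken from the article.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Return the ROC AUC via pairwise comparison of positives and negatives.

    Each (positive, negative) pair contributes 1 if the positive case has
    the higher score, 0.5 on a tie, and 0 otherwise.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A rule-based criterion produces a yes/no prediction and is therefore summarized by a single sensitivity and specificity pair, whereas a classifier that outputs a score can be summarized across all thresholds by the AUC. The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch but would usually be replaced by a sort-based routine from a statistics library on a cohort of several thousand patients.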

Details

Title
Machine-learning-based identification of patients with IgA nephropathy using a computerized medical billing database
Author
Tsunoda, Ryoya; Kume, Keitaro; Kagawa, Rina; Sanuki, Masaru; Kitagawa, Hiroyuki; Mase, Kaori; Yamagata, Kunihiro
First page
e0312915
Section
Research Article
Publication year
2024
Publication date
Dec 2024
Publisher
Public Library of Science
e-ISSN
1932-6203
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3141380410