Abstract
Standard reference terminology of diagnoses and risk factors is crucial for billing, epidemiological studies, and international and intranational comparisons of diseases. The International Classification of Diseases (ICD) is a standardized and widely used system, but manual classification is enormously time-consuming. Natural language processing combined with machine learning allows automated structuring of diagnoses using ICD-10 codes, but the limited performance of machine learning models, the need for very large datasets, and the poor reliability of the terminal parts of these codes have restricted clinical usability. We aimed to create a high-performing pipeline for automated classification of reliable ICD-10 codes in free medical text in cardiology. We focussed on frequently used and well-defined three- and four-character ICD-10 codes that still have enough granularity to be clinically relevant, such as atrial fibrillation (I48), acute myocardial infarction (I21), or dilated cardiomyopathy (I42.0). Our pipeline uses a deep neural network known as a Bidirectional Gated Recurrent Unit Neural Network; it was trained and tested with 5548 discharge letters and validated in 5089 discharge and procedural letters. Because discharge letters in clinical practice may be labeled with more than one code, we assessed the single- and multilabel performance for main diagnoses and cardiovascular risk factors. We investigated using both the entire body of text and only the summary paragraph, supplemented by age and sex. Given the privacy-sensitive information included in discharge letters, we added a de-identification step. The performance was high, with F1 scores of 0.76–0.99 for three-character and 0.87–0.98 for four-character ICD-10 codes, and was best when using complete discharge letters. Adding age and sex as variables did not affect the results. For model interpretability, word coefficients were provided and a qualitative assessment of the classifications was performed manually.
Because of its high performance, this pipeline can be useful to decrease the administrative burden of classifying discharge diagnoses and may serve as a scaffold for reimbursement and research applications.
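To make the reported metric concrete, the following is a minimal sketch of how a per-code F1 score can be computed in a multilabel setting, where each letter may carry several ICD-10 codes. It is not the authors' evaluation code: the function name `per_label_f1` and the toy letters are hypothetical, and each code is simply treated as its own binary task with F1 = 2·TP / (2·TP + FP + FN).

```python
from typing import Dict, List, Set


def per_label_f1(true_labels: List[Set[str]],
                 pred_labels: List[Set[str]]) -> Dict[str, float]:
    """Return one F1 score per code, treating each code as a binary task."""
    # Every code that occurs in either the reference or the predicted labels.
    codes = set().union(*true_labels, *pred_labels)
    scores: Dict[str, float] = {}
    for code in sorted(codes):
        pairs = list(zip(true_labels, pred_labels))
        tp = sum(1 for t, p in pairs if code in t and code in p)
        fp = sum(1 for t, p in pairs if code not in t and code in p)
        fn = sum(1 for t, p in pairs if code in t and code not in p)
        denom = 2 * tp + fp + fn
        # Convention assumed here: F1 is 0.0 when a code never occurs.
        scores[code] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical toy data: four letters, each with a set of ICD-10 codes.
true_labels = [{"I48"}, {"I21", "I48"}, {"I42.0"}, {"I21"}]
pred_labels = [{"I48"}, {"I21"}, {"I42.0"}, {"I21", "I48"}]

scores = per_label_f1(true_labels, pred_labels)
for code, f1 in scores.items():
    print(f"{code}: F1 = {f1:.2f}")
```

In this toy example I48 is missed once and wrongly assigned once, so its F1 is lower than that of I21 and I42.0, which are predicted perfectly. Reporting a range of per-code scores, as the abstract does, reflects this kind of per-label breakdown.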
Details



1 University Medical Centre Utrecht, University of Utrecht, Department of Cardiology, Division of Heart & Lungs, Utrecht, The Netherlands (GRID:grid.5477.1) (ISNI:0000000120346234)
2 University Medical Centre Utrecht, University of Utrecht, Department of Cardiology, Division of Heart & Lungs, Utrecht, The Netherlands (GRID:grid.5477.1) (ISNI:0000000120346234); Utrecht University, Department of Methodology and Statistics, Faculty of Social Sciences, Utrecht, The Netherlands (GRID:grid.5477.1) (ISNI:0000000120346234)
3 Utrecht University, Department of Methodology and Statistics, Faculty of Social Sciences, Utrecht, The Netherlands (GRID:grid.5477.1) (ISNI:0000000120346234); University of Southampton, Highfield, Southampton Statistical Sciences Research Institute, Southampton, UK (GRID:grid.5491.9) (ISNI:0000000419369297)
4 University Medical Centre Utrecht, University of Utrecht, Department of Genetics, Division Laboratories, Pharmacy and Biomedical Genetics, Utrecht, The Netherlands (GRID:grid.5477.1) (ISNI:0000000120346234)
5 University Medical Centre Utrecht, Department of Information and Finance, Division of Health Administration and Information, Utrecht, The Netherlands (GRID:grid.7692.a) (ISNI:0000000090126352)
6 Utrecht University, Department of Methodology and Statistics, Faculty of Social Sciences, Utrecht, The Netherlands (GRID:grid.5477.1) (ISNI:0000000120346234)
7 University Medical Centre Utrecht, University of Utrecht, Department of Cardiology, Division of Heart & Lungs, Utrecht, The Netherlands (GRID:grid.5477.1) (ISNI:0000000120346234); University College London, Institute of Cardiovascular Science, Faculty of Population Health Sciences, London, UK (GRID:grid.83440.3b) (ISNI:0000000121901201); University College London, Health Data Research UK, Institute of Health Informatics, London, UK (GRID:grid.83440.3b) (ISNI:0000000121901201)