Abstract

Background

The amount of patient-related information within clinical information systems accumulates over time, especially in cases where patients suffer from chronic diseases with many hospitalizations and consultations. The diagnosis or problem list is an important feature of the electronic health record, which provides a dynamic account of a patient’s current illness and past history. In the case of an Austrian hospital network, problem list entries are limited to fifty characters and are potentially linked to ICD-10. The requirement of producing ICD codes at each hospital stay, together with the length limitation of list items leads to highly redundant problem lists, which conflicts with the physicians’ need of getting a good overview of a patient in short time.

This paper investigates a method, by which problem list items can be semantically grouped, in order to allow for fast navigation through patient-related topic spaces.

Methods

We applied a minimal language-dependent preprocessing strategy and mapped problem list entries as tf-idf weighted character 3-grams into a numerical vector space. Based on this representation we used the unweighted pair group method with arithmetic mean (UPGMA) clustering algorithm with cosine distances and inferred an optimal boundary in order to form semantically consistent topic spaces, taking into consideration different levels of dimensionality reduction via latent semantic analysis (LSA).

Results

With the proposed clustering approach, evaluated via an intra- and inter-patient scenario in combination with a natural language pipeline, we achieved an average compression rate of 80% of the initial list items forming consistent semantic topic spaces with an F-measure greater than 0.80 in both cases. The average number of identified topics in the intra-patient case (μIntra = 78.4) was slightly lower than in the inter-patient case (μInter = 83.4). LSA-based feature space reduction had no significant positive performance impact in our investigations.

Conclusions

The investigation presented here is centered on a data-driven solution to the known problem of information overload, which causes ineffective human-computer interactions at clinicians’ work places. This problem is addressed by navigable disease topic spaces where related items are grouped and the topics can be more easily accessed.

Details

Title
EHR problem list clustering for improved topic-space navigation
Author
Pfeifer, Markus Kreuzthalerstian; Vera Ramos, Jose Antonio; Kramer, Diether; Grogger, Victor; Bredenfeldt, Sylvia; Pedevilla, Markus; Krisper, Peter; Schulz, Stefan
Publication year
2019
Publication date
2019
Publisher
BioMed Central
e-ISSN
14726947
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2211325865
Copyright
© 2019. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.