Full Text

Turn on search term navigation

© 2014 Morley et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Background

National electronic health records (EHR) are increasingly used for research but identifying disease cases is challenging due to differences in information captured between sources (e.g. primary and secondary care). Our objective was to provide a transparent, reproducible model for integrating these data using atrial fibrillation (AF), a chronic condition diagnosed and managed in multiple ways in different healthcare settings, as a case study.

Methods

Potentially relevant codes for AF screening, diagnosis, and management were identified in four coding systems: Read (primary care diagnoses and procedures), British National Formulary (BNF; primary care prescriptions), ICD-10 (secondary care diagnoses) and OPCS-4 (secondary care procedures). From these we developed a phenotype algorithm via expert review and analysis of linked EHR data from 1998 to 2010 for a cohort of 2.14 million UK patients aged ≥30 years. The cohort was also used to evaluate the phenotype by examining associations between incident AF and known risk factors.

Results

The phenotype algorithm incorporated 286 codes: 201 Read, 63 BNF, 18 ICD-10, and four OPCS-4. Incident AF diagnoses were recorded for 72,793 patients, but only 39.6% (N = 28,795) were recorded in primary care and secondary care. An additional 7,468 potential cases were inferred from data on treatment and pre-existing conditions. The proportion of cases identified from each source differed by diagnosis age; inferred diagnoses contributed a greater proportion of younger cases (≤60 years), while older patients (≥80 years) were mainly diagnosed in SC. Associations of risk factors (hypertension, myocardial infarction, heart failure) with incident AF defined using different EHR sources were comparable in magnitude to those from traditional consented cohorts.

Conclusions

A single EHR source is not sufficient to identify all patients, nor will it provide a representative sample. Combining multiple data sources and integrating information on treatment and comorbid conditions can substantially improve case identification.

Details

Title
Defining Disease Phenotypes Using National Linked Electronic Health Records: A Case Study of Atrial Fibrillation
Author
Morley, Katherine I; Wallace, Joshua; Denaxas, Spiros C; Hunter, Ross J; Patel, Riyaz S; Perel, Pablo; Shah, Anoop D; Timmis, Adam D; Schilling, Richard J; Hemingway, Harry
First page
e110900
Section
Research Article
Publication year
2014
Publication date
Nov 2014
Publisher
Public Library of Science
e-ISSN
19326203
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
1620165225
Copyright
© 2014 Morley et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.