A prototype ETL pipeline that uses HL7 FHIR RDF resources when deploying pure functions to enrich knowledge graph patient data

Abstract

Background

For clinical care and research, knowledge graphs with patient data can be enriched by extracting parameters from a knowledge graph and then using them as inputs to compute new patient features with pure functions. Systematic and transparent methods for enriching knowledge graphs with newly computed patient features are of interest. When enriching the patient data in knowledge graphs this way, existing ontologies and well-known data resource standards can help promote semantic interoperability.
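The enrichment idea in the Background can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the graph is modeled as a plain list of triples, the `ex:` predicate names are hypothetical, and body mass index is chosen only as a familiar example of a derived patient feature.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object


def bmi(weight_kg: float, height_m: float) -> float:
    """Pure function: the result depends only on the arguments, with no side effects."""
    return round(weight_kg / height_m ** 2, 1)


def extract(graph, patient, predicates):
    """Collect the parameter values stored for one patient (hypothetical helper)."""
    return {
        t.predicate: t.obj
        for t in graph
        if t.subject == patient and t.predicate in predicates
    }


def enrich(graph, patient):
    """Return a new graph that includes the derived feature; the input list is not mutated."""
    params = extract(graph, patient, {"ex:weightKg", "ex:heightM"})
    value = bmi(params["ex:weightKg"], params["ex:heightM"])
    return graph + [Triple(patient, "ex:bmi", value)]


# Toy data with hypothetical identifiers.
graph = [
    Triple("ex:patient1", "ex:weightKg", 70.0),
    Triple("ex:patient1", "ex:heightM", 1.75),
]
enriched = enrich(graph, "ex:patient1")
```

Because `bmi` is pure, the same extracted parameters always yield the same derived value, which is what makes each computation repeatable and traceable.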

Results

We developed and tested a new data processing pipeline for extracting, computing, and returning newly computed results to a large knowledge graph populated with electronic health record and patient survey data. We show that RDF data resource types already specified by Health Level 7's FHIR RDF effort can be programmatically validated and then used by this new data processing pipeline to represent newly derived patient-level features.
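As a rough illustration of validating a derived feature before it is written back to the graph, the sketch below checks a simplified, dictionary-based stand-in for an Observation-like resource against a list of required fields. It rests on several assumptions: the record's keywords mention ShEx validation of RDF, whereas the field names, the `REQUIRED_FIELDS` shape, and the `validate` helper here are hypothetical and are neither the FHIR RDF serialization nor a ShEx engine.

```python
# Hypothetical, simplified shape: field name -> expected Python type.
# A real pipeline would validate RDF data against a ShEx schema instead.
REQUIRED_FIELDS = {
    "resourceType": str,
    "status": str,
    "subject": str,
    "code": str,
    "value": float,
}


def validate(resource: dict) -> list[str]:
    """Return a list of problems; an empty list means the resource conforms."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in resource:
            problems.append(f"missing field: {field}")
        elif not isinstance(resource[field], expected):
            problems.append(f"wrong type for {field}")
    return problems


# Toy resources with hypothetical identifiers.
good = {
    "resourceType": "Observation",
    "status": "final",
    "subject": "ex:patient1",
    "code": "ex:derivedFeature",
    "value": 22.9,
}
bad = {"resourceType": "Observation", "status": "final", "value": "22.9"}

good_problems = validate(good)
bad_problems = validate(bad)
```

Only resources with an empty problem list would be passed on for insertion into the graph in this sketch.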

Conclusions

Knowledge graph technology can be augmented with standards-based semantic data processing pipelines for deploying and tracing the use of pure functions to derive new patient-level features from existing data. Semantic data processing pipelines enable research enterprises to report on new patient-level computations of interest with linked metadata that details the origin and background of every new computation.
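One way to picture the linked metadata mentioned in the Conclusions is a wrapper that returns a provenance record alongside each computed value. The sketch below is only an assumption about how such tracing could look: `ProvenanceRecord`, `traced`, and `mean_of_two` are hypothetical names, and the record fields are illustrative rather than the metadata model used in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    function_name: str
    function_version: str
    inputs: tuple
    output: float


def traced(func, version):
    """Wrap a pure function so each call also returns a provenance record."""
    def wrapper(**kwargs):
        result = func(**kwargs)
        record = ProvenanceRecord(
            function_name=func.__name__,
            function_version=version,
            inputs=tuple(sorted(kwargs.items())),
            output=result,
        )
        return result, record
    return wrapper


def mean_of_two(a: float, b: float) -> float:
    """Hypothetical pure function used only to exercise the wrapper."""
    return (a + b) / 2


traced_mean = traced(mean_of_two, "0.1.0")
value, record = traced_mean(a=4.0, b=6.0)
```

Storing the record next to the derived value lets a later reader see which function, which version, and which inputs produced it.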

Details

Business indexing term

Subject:

Machine learning

Location

Canada

Identifier / keyword

Linked data; FHIR; Fast health interoperability resources; Pure functions; Data enrichment; Knowledge graphs; ShEx validation; FAIR; ETL

Title

A prototype ETL pipeline that uses HL7 FHIR RDF resources when deploying pure functions to enrich knowledge graph patient data

Author

Ansari, Adeel; Conte, Marisa; Flynn, Allen; Paturkar, Avanti

Publication title

Journal of Biomedical Semantics; London

Volume

Pages

1-12

Number of pages

Publication year

2025

Publication date

2025

Section

Research

Publisher

Springer Nature B.V.

Place of publication

London

Country of publication

Netherlands

Publication subject

Biology--Bioengineering

e-ISSN

20411480

Source type

Scholarly Journal

Language of publication

English

Document type

Journal Article

Publication history

Online publication date

2025-09-01

Milestone dates

2024-12-16 (Received); 2025-07-01 (Accepted); 2025-09-01 (Published)

First posting date

01 Sep 2025

DOI

https://doi.org/10.1186/s13326-025-00335-4

ProQuest document ID

3247146211

Document URL

https://www.proquest.com/scholarly-journals/prototype-etl-pipeline-that-uses-hl7-fhir-rdf/docview/3247146211/se-2?accountid=208611

© 2025. This work is licensed under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Last updated

2025-09-05

Database

2 databases

Coronavirus Research Database
ProQuest One Academic
