Abstract

Background

Alzheimer’s disease (AD) and related dementias (ADRD) are common in older adults, and their prevention and management remain challenging. Dietary supplements (DS) have emerged as a promising option for preventing or delaying ADRD; however, the effect of DS usage on disease progression in patients with cognitive impairment remains unclear. Little clinical trial evidence is available, but electronic health records (EHR) contain substantial information, in both structured and unstructured form, about patients’ DS usage and disease status. The objectives of this study were to (1) develop accurate natural language processing (NLP) methods to extract DS usage for patients with Mild Cognitive Impairment (MCI) and ADRD, (2) examine the coverage of DS in structured versus unstructured data, and (3) compare DS usage information in EHR with National Health and Nutrition Examination Survey (NHANES) data.

Methods

We collected EHR data for patients with MCI and ADRD and developed a pipeline to extract DS usage information from both structured data and unstructured clinical notes. For structured data, we used the medication table to identify DS; for unstructured clinical notes, we fine-tuned Bidirectional Encoder Representations from Transformers (BERT) models to extract DS usage status.
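As an illustration of how DS found in the two sources might be compared and combined, the following is a minimal sketch under stated assumptions, not the authors' implementation. The function name `compare_coverage` and the toy supplement lists are hypothetical; the sketch assumes DS names have already been extracted from the medication table and from the notes.

```python
# Minimal sketch: compare DS coverage between two EHR sources.
# Function name and toy data are hypothetical, not taken from the study.
def compare_coverage(structured, unstructured):
    """Return the overlap, source-specific, and combined sets of DS names."""
    # Lowercase so the same supplement matches across sources.
    structured = {name.lower() for name in structured}
    unstructured = {name.lower() for name in unstructured}
    return {
        "both": structured & unstructured,
        "structured_only": structured - unstructured,
        "unstructured_only": unstructured - structured,
        "combined": structured | unstructured,
    }


result = compare_coverage(
    ["Vitamin D", "Fish Oil"],            # e.g. from the medication table
    ["fish oil", "ginkgo", "melatonin"],  # e.g. extracted from clinical notes
)
print(sorted(result["combined"]))
```

A real pipeline would likely also need to normalize synonyms and brand names before comparing, which this sketch does not attempt.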

Results

The best named entity recognition model for DS achieved an F1-score of 0.964, and the PubMed BERT-based use status classifier achieved a weighted F1-score of 0.879. We applied these models to extract DS usage information from unstructured clinical notes, then compared and combined the results with those from structured medication orders. In total, 125 unique DS were identified for patients with MCI and 108 for patients with ADRD.
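For readers unfamiliar with the weighted F1-score reported for the use status classifier, the sketch below shows the standard definition: per-class F1 averaged with weights proportional to each class's support. The labels `use` and `no_use` and the toy predictions are hypothetical and are not the study's label set or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total  # weight by class frequency
    return score


# Hypothetical labels and predictions, for illustration only.
y_true = ["use", "use", "no_use", "no_use"]
y_pred = ["use", "no_use", "no_use", "no_use"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Weighting by support means frequent status classes dominate the score, so a weighted F1 can hide weak performance on rare classes.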

Conclusions

In this study, we developed an NLP-based pipeline to extract DS usage information from structured medication data and clinical notes in EHR for patients with MCI and ADRD. Our method could help clarify DS usage among patients with MCI and ADRD and how these DS might influence these diseases.

Details

1009240
Title
Identification of dietary supplement use from electronic health records using transformer-based language models
Volume
22
Supplement
3
Pages
1-15
Number of pages
12
Publication year
2025
Publication date
2025
Section
Research
Publisher
Springer Nature B.V.
Place of publication
London
Country of publication
Netherlands
e-ISSN
14726947
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-11-03
Milestone dates
2021-08-31 (Received); 2025-10-14 (Accepted); 2025-11-03 (Published)
   First posting date
03 Nov 2025
ProQuest document ID
3268429972
Document URL
https://www.proquest.com/scholarly-journals/identification-dietary-supplement-use-electronic/docview/3268429972/se-2?accountid=208611
Copyright
© 2025. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-11-04
Database
2 databases
  • Coronavirus Research Database
  • ProQuest One Academic