Abstract
Accurately and efficiently diagnosing Post-Acute Sequelae of COVID-19 (PASC) remains challenging due to its myriad symptoms that evolve over long- and variable-time intervals. To address this issue, we developed a hybrid natural language processing pipeline that integrates rule-based named entity recognition with BERT-based assertion detection modules for PASC-symptom extraction and assertion detection from clinical notes. We developed a comprehensive PASC lexicon with clinical specialists. From 11 health systems of the RECOVER initiative network across the U.S., we curated 160 intake progress notes for model development and evaluation, and collected 47,654 progress notes for a population-level prevalence study. We achieved an average F1 score of 0.82 in one-site internal validation and 0.76 in 10-site external validation for assertion detection. Our pipeline processed each note at 2.448 ± 0.812 seconds on average. Spearman correlation tests showed ⍴ > 0.83 for positive mentions and ⍴ > 0.72 for negative ones, both with P < 0.0001. These demonstrate the effectiveness and efficiency of our models and its potential for improving PASC diagnosis.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details
1 Population Health Sciences, Weill Cornell Medicine, New York, USA (ROR: https://ror.org/02r109517) (GRID: grid.471410.7) (ISNI: 0000 0001 2179 7643)
2 Nemours Children’s Health, Wilmington, USA
3 RECOVER Patient, Caregiver, or Community Advocate Representative, New York, USA
4 Applied Clinical Research Center, Children’s Hospital of Philadelphia, Philadelphia, USA (ROR: https://ror.org/01z7r7q48) (GRID: grid.239552.a) (ISNI: 0000 0001 0680 8770)
5 Louisiana Public Health Institute, New Orleans, USA (ROR: https://ror.org/01nacjv05) (GRID: grid.468191.3) (ISNI: 0000 0004 0626 8374)




