Abstract
Identifying cohorts of patients based on eligibility criteria such as medical conditions, procedures, and medication use is critical to recruitment for clinical trials. Such criteria are often most naturally described in free text, using language familiar to clinicians and researchers. To identify potential participants at scale, these criteria must first be translated into queries on clinical databases, a process that can be labor-intensive and error-prone. Natural language processing (NLP) methods offer a potential means of automating this conversion into database queries. However, they must first be trained and evaluated using corpora that capture clinical trial criteria in sufficient detail. In this paper, we introduce the Leaf Clinical Trials (LCT) corpus, a human-annotated corpus of over 1,000 clinical trial eligibility criteria descriptions using highly granular structured labels capturing a range of biomedical phenomena. We provide details of our schema, annotation process, corpus quality, and statistics. Additionally, we present baseline information extraction results on this corpus as benchmarks for future work.
Measurement(s) | Clinical Trial Eligibility Criteria |
Technology Type(s) | natural language processing |
Sample Characteristic - Organism | Homo sapiens |
Details
; Mullen, Tony 2 ; Uzuner, Özlem 3 ; Yetisgen, Meliha 1
1 University of Washington, Department of Biomedical Informatics & Medical Education, Seattle, USA (GRID:grid.34477.33) (ISNI:0000 0001 2298 6657)
2 Northeastern University, Khoury College of Computer Science, Seattle, USA (GRID:grid.261112.7) (ISNI:0000 0001 2173 3359)
3 George Mason University, Department of Information Sciences and Technology, Fairfax, USA (GRID:grid.22448.38) (ISNI:0000 0004 1936 8032)




