In this work, we report on a series of natural language processing tools and models to improve the efficiency and accuracy of information discovery from clinical trials and pharmacological studies. Our main contributions are:
1. The development of an open-source platform, Tri-AL, that
• Enables dynamic tracking of clinical trial information over time,
• Excels in data visualization and user interaction with a particular emphasis on enhancing the analysis and representation of race and ethnicity data to foster equity in clinical research, and
• Includes a predictive model utilizing machine learning to decipher drug mechanisms of action.
2. Heterogeneous Graph Neural Network for Gene-Chemical Entity Relation Extraction: We created a supervised deep learning model that adapts a heterogeneous Graph Neural Network to extract gene-chemical entities and the relations between them. The model augments word representations through message passing in order to accurately identify gene-chemical named entities and classify their relationships.
3. Bipartite Graph Model for Evaluating Summarization Performance: We proposed a bipartite graph model to evaluate the performance of large language models in summarizing clinical trials. This model provides a robust framework to assess the accuracy and effectiveness of automated summarization tools in the medical domain.
