Abstract

Chinese Medical Named Entity Recognition (CMNER) seeks to identify and extract medical entities from unstructured medical texts. Existing methods often depend on single-modality representations and fail to fully exploit the complementary nature of different features. This paper presents a medical named entity recognition approach based on multimodal information fusion with a hybrid attention mechanism. A Dual-Stream Network architecture extracts multimodal features at both the character and word levels, and these features are then deeply fused to enhance the model’s ability to recognize medical entities. A Cross-Stream Attention mechanism facilitates information exchange between the modalities and captures cross-modal global dependencies. Multi-Head Attention then further enhances the feature representation and improves the model’s ability to delineate medical entity boundaries. A Conditional Random Field (CRF) layer performs decoding, ensuring global consistency in entity predictions and thereby enhancing recognition accuracy and robustness. The proposed method achieves F1 scores of 65.26%, 80.31%, and 86.73% on the CMeEE-V2, IMCS-V2-NER, and CHIP-STS datasets, respectively, outperforming the models it is compared against.
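The two mechanisms named above, attention across streams and CRF decoding, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head with no learned projections, a hypothetical three-tag scheme (O, B, I), and hand-set scores. The function names `cross_stream_attention` and `viterbi_decode` are illustrative only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_stream_attention(queries, keys, values):
    """Scaled dot-product attention from one stream onto another.

    Each query vector (e.g. a character-level feature) attends over all
    key/value vectors of the other stream (e.g. word-level features).
    Assumption: single head, no learned projection matrices.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][y]   : score of tag y at position t
    transitions[a][b] : score of moving from tag a to tag b
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for b in range(n_tags):
            best_a = max(range(n_tags), key=lambda a: score[a] + transitions[a][b])
            pointers.append(best_a)
            new_score.append(score[best_a] + transitions[best_a][b] + emit[b])
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    path = [max(range(n_tags), key=lambda y: score[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag ids: 0 = O, 1 = B, 2 = I. The transition O -> I is forbidden.
transitions = [[0.0, 0.0, -1e4], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
emissions = [[2.0, 0.0, 0.0], [0.0, 1.0, 1.5], [0.0, 0.0, 2.0]]
best_path = viterbi_decode(emissions, transitions)

# One character query attending over two word-level vectors.
fused = cross_stream_attention([[1.0, 0.0]], [[10.0, 0.0], [0.0, 10.0]], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy input, choosing the best tag independently at each position would yield O, I, I, which violates the tagging scheme. Decoding over the whole sequence instead selects O, B, I, which is the kind of global consistency a CRF layer is meant to provide.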

Full text

© 2025 Luo et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.