Abstract

Prolonged and excessive interaction with cyberspace threatens people’s health and leads to Cyber-Syndrome, which covers both physiological and psychological disorders. This paper aims to create a tree-shaped gold-standard corpus, designated CS-A, that annotates Cyber-Syndrome, its clinical manifestations, and the acupoints that can alleviate the corresponding symptoms or signs. For the CS-A corpus, this paper defines six entities and relations subject to annotation. In total, 448 texts were annotated manually. After three rounds of updating the annotation guidelines, the inter-annotator agreement (IAA) improved significantly, reaching 86.05%. The purpose of constructing the CS-A corpus is to raise awareness of Cyber-Syndrome and draw attention to its subtle impact on people’s health. In addition, the annotated corpus supports the development of natural language processing technology. Model experiments can be built on this corpus, such as optimizing and improving models for discontinuous entity recognition and nested entity recognition. The CS-A corpus has been uploaded to figshare.

Details

Title
A tree-based corpus annotated with Cyber-Syndrome, symptoms, and acupoints
Author
Wang, Wenxi 1; Zhao, Zhan 1; Ning, Huansheng 1

1 University of Science and Technology Beijing, School of Computer & Communication Engineering, Beijing, China (GRID:grid.69775.3a) (ISNI:0000 0004 0369 0705)
Pages
482
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
e-ISSN
20524463
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3053358947
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.