Abstract

Assessing the readability of Chinese text plays a significant role in Chinese language education. Because Chinese characters differ intrinsically from alphabetic writing systems, readability assessment is more challenging, given the language’s inherent complexity in vocabulary, syntax, and semantics. This article proposes a conceptual analogy between Chinese readability assessment and the rhythm and tempo patterns of music, under which the syntactic structure of Chinese sentences can be transformed into an image. The Chinese Knowledge and Information Processing Tagger (CkipTagger), a tool developed at Academia Sinica, Taiwan, is used to decompose the Chinese text into a set of tokens. These tokens are then refined through a user-defined token pool to retain meaningful units. An image carrying part-of-speech (POS) information is generated by aligning tokens against syntax. A discrete cosine transform (DCT) is then applied to extract the temporal characteristics of the text. In addition, the study integrates four categories of linguistic features for the readability assessment: type-token ratio, average sentence length, total word count, and vocabulary difficulty level. Finally, these features are fed into a support vector machine (SVM) classifier. A bidirectional long short-term memory (Bi-LSTM) network is also adopted for quantitative comparison. In the experiments, a total of 774 Chinese texts aligned with the Taiwan Benchmarks for the Chinese Language were selected and graded by Chinese language experts, with equal numbers of basic, intermediate, and advanced texts. The findings indicate that the proposed POS representation combined with the linguistic features works well with the SVM, and that its performance matches that of more complex architectures such as the Bi-LSTM network in Chinese readability assessment.
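As a rough illustration of two of the ingredients named in the abstract, the sketch below computes a type-token ratio, an average sentence length, and a DCT-II over a sequence of POS codes. It is a minimal sketch under stated assumptions, not the authors' implementation: the pre-tokenized sentences stand in for CkipTagger output, the integer POS codes are a hypothetical mapping, and the transform is applied to a one-dimensional sequence rather than to the token-versus-syntax image described in the article.

```python
import math


def type_token_ratio(tokens):
    """Number of distinct tokens divided by total tokens (0.0 if empty)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def average_sentence_length(sentences):
    """Mean number of tokens per sentence (0.0 if there are no sentences)."""
    if not sentences:
        return 0.0
    return sum(len(s) for s in sentences) / len(sentences)


def dct_ii(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(
            x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal)
        )
        for k in range(n_samples)
    ]


# Hypothetical, already-segmented sentences (a stand-in for tagger output).
sentences = [["我", "喜歡", "音樂"], ["音樂", "很", "美"]]
tokens = [tok for sent in sentences for tok in sent]

# Hypothetical integer codes for the POS tag of each token, in text order.
pos_codes = [1.0, 2.0, 3.0, 3.0, 4.0, 2.0]

ttr = type_token_ratio(tokens)
avg_len = average_sentence_length(sentences)
total_words = len(tokens)
dct_coefficients = dct_ii(pos_codes)

# One possible feature vector: linguistic features plus low-order DCT terms.
feature_vector = [ttr, avg_len, float(total_words)] + dct_coefficients[:4]
```

A vector like `feature_vector` could then be passed to any standard classifier; the choice of four DCT coefficients here is arbitrary and only for illustration.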

Details

Title
Chinese Text Readability Assessment Based on the Integration of Visualized Part-of-Speech Information with Linguistic Features
Author
Chi-Yi, Hsieh 1; Jing-Yan, Lin 2; Chi-Wen, Hsieh 3; Bo-Yuan, Huang 4; Yi-Chi, Huang 4; Yu-Xiang, Chen 4

1 The Institute of Chinese Language Education, National Kaohsiung Normal University, Kaohsiung 80201, Taiwan; [email protected]
2 Department of Electrical Engineering, National Chiayi University, Chiayi City 600325, Taiwan; [email protected]
3 Department of Electrical Engineering, National Chung Cheng University, Minhsiung 621301, Taiwan; [email protected] (B.-Y.H.); [email protected] (Y.-C.H.); [email protected] (Y.-X.C.), Advanced Institute of Manufacturing with High-Tech Innovations, Ans. 621301 Innovation Building R209, 168 University Road, Ming-Hsiung Township, Chia-Yi 621301, Taiwan
4 Department of Electrical Engineering, National Chung Cheng University, Minhsiung 621301, Taiwan; [email protected] (B.-Y.H.); [email protected] (Y.-C.H.); [email protected] (Y.-X.C.)
Publication title
Algorithms; Basel
Volume
18
Issue
12
First page
777
Number of pages
17
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
e-ISSN
19994893
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-12-09
Milestone dates
2025-11-02 (Received); 2025-12-08 (Accepted)
First posting date
09 Dec 2025
ProQuest document ID
3286250259
Document URL
https://www.proquest.com/scholarly-journals/chinese-text-readability-assessment-based-on/docview/3286250259/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-12-24
Database
ProQuest One Academic