Abstract

Cross-lingual sentiment analysis (CLSA) leverages one or more source languages to help low-resource languages perform sentiment analysis, thereby alleviating the lack of annotated corpora in many non-English languages. With the development of economic globalization, CLSA has attracted much attention in the field of sentiment analysis, and the last decade has seen a surge of research in this area. Numerous methods, datasets and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing state-of-the-art CLSA approaches from 2004 to the present. It outlines the research context of cross-lingual sentiment analysis and elaborates on the following methods in detail: (1) the main early methods of CLSA, including those based on machine translation and its improved variants, parallel corpora, or bilingual sentiment lexicons; (2) CLSA based on cross-lingual word embeddings; (3) CLSA based on multi-BERT and other pre-trained models. We further analyze their main ideas, methodologies and shortcomings, and attempt to draw conclusions about their language coverage, datasets and performance. Finally, we look at the future development of CLSA and the challenges facing the research area.

Details

Title
A Survey of Cross-lingual Sentiment Analysis: Methodologies, Models and Evaluations
Author
Xu, Yuemei 1; Cao, Han 1; Du, Wanze 1; Wang, Wenqing 1

1 Beijing Foreign Studies University, School of Information Science and Technology, Haidian, Beijing, China (GRID:grid.443245.0) (ISNI:0000 0001 1457 2745)
Pages
279-299
Publication year
2022
Publication date
Sep 2022
Publisher
Springer Nature B.V.
e-ISSN
2364-1541
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2890052282
Copyright
© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.