Affective text analysis, such as sentiment analysis and emotion recognition, has long been studied in the research community but remains challenging. One major reason is that current natural language processing systems still struggle to recognize implicit affective expressions, in which affect is conveyed without explicit affect-bearing words. To address this challenge, this dissertation focuses on two learning tasks that acquire two types of implicit affective expressions, both common and critical for affective text analysis.
The first learning task is affective event recognition, which aims to classify whether an event impacts most people positively (e.g., "I watched the sunrise"), negatively (e.g., "I broke my leg"), or neutrally (e.g., "I opened the door"). This dissertation first identifies the limitations of previous approaches and introduces a deep learning classifier that mitigates them. It then presents two novel semi-supervised learning methods that produce additional training data to improve the classifier. The first method, Discourse-Enhanced Self-Training, produces new affective events by exploiting coreference relations between events and sentiment expressions. The second method, Multiple View Co-Prompting, generates high-quality new affective events by prompting language models. Experiments show that the affective events produced by these two methods substantially improve affective event classifiers.
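To make the data-harvesting idea concrete, the sketch below illustrates one round of discourse-enhanced self-training in Python. The seed events, the SENTIMENT_CUES lexicon, and the polarity_from_coreferent_sentiment helper are illustrative assumptions standing in for the dissertation's actual coreference resolution, classifier, and confidence filtering, none of which are shown here.

```python
# A minimal sketch of the self-training idea: events whose discourse
# context contains a coreferent sentiment expression inherit that
# expression's polarity and become new training examples.

from collections import Counter

# Toy seed data: (event phrase, polarity) pairs from a gold set.
SEED = [
    ("i watched the sunrise", "positive"),
    ("i broke my leg", "negative"),
    ("i opened the door", "neutral"),
]

# Hypothetical sentiment lexicon standing in for sentiment expressions
# that corefer with an event in the surrounding discourse.
SENTIMENT_CUES = {"wonderful": "positive", "awful": "negative"}

def polarity_from_coreferent_sentiment(context: str) -> str | None:
    """Stand-in for the coreference step: if the discourse context holds
    a sentiment cue referring back to the event (e.g., "That was awful"),
    transfer that cue's polarity to the event."""
    for word, polarity in SENTIMENT_CUES.items():
        if word in context:
            return polarity
    return None

def self_train(unlabeled: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """One harvesting round: `unlabeled` holds (event, discourse context)
    pairs; events whose contexts yield a polarity become new examples."""
    new_examples = []
    for event, context in unlabeled:
        polarity = polarity_from_coreferent_sentiment(context)
        if polarity is not None:
            new_examples.append((event, polarity))
    return new_examples

if __name__ == "__main__":
    unlabeled = [
        ("i passed the exam", "i passed the exam. that felt wonderful."),
        ("i missed my flight", "i missed my flight. it was awful."),
    ]
    harvested = self_train(unlabeled)
    print(harvested)
    print(Counter(label for _, label in SEED + harvested))
```

The point of the design, as described above, is that the new label is transferred from a sentiment expression that refers back to the event in the discourse rather than predicted by the classifier alone, which anchors each harvested label in explicit textual evidence.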
The second learning task is to recognize expressions of embodied emotion in natural language, which describe physical reactions of the body that arise with an emotion (e.g., "my legs shake due to fear"). This dissertation first introduces a new task that aims to identify whether a body part mention is involved in an embodied emotion. It then presents two semi-supervised algorithms that generate weakly labeled data to improve a classifier. The first algorithm extracts weakly labeled data from text by using manner expressions with emotion, and the second generates weakly labeled data by prompting a large language model. Experiments demonstrate that the harvested weakly labeled data can train an effective classifier on its own and can further improve a supervised classifier when combined with gold training data.
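As an illustration of the second algorithm, the sketch below shows how prompting a language model could yield weakly labeled embodied-emotion examples. The prompt wording, the query_llm stub, and its canned responses are hypothetical placeholders; the dissertation's actual prompts, model, and quality filtering are not shown.

```python
# A minimal sketch of harvesting weakly labeled embodied-emotion data by
# prompting a language model: each generated sentence is treated as a
# positive example in which the body part mention is involved in an
# embodied emotion.

PROMPT_TEMPLATE = (
    "Write a short first-person sentence in which the body part "
    '"{body_part}" physically reacts to the emotion "{emotion}".'
)

def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real language model API call."""
    canned = {
        "legs": "My legs shook with fear as I stepped onto the stage.",
        "hands": "My hands trembled with excitement as I opened the letter.",
    }
    for body_part, sentence in canned.items():
        if body_part in prompt:
            return sentence
    return "My heart raced with joy."

def generate_weak_labels(
    pairs: list[tuple[str, str]],
) -> list[tuple[str, str, str]]:
    """Return (sentence, body part, emotion) triples as weakly labeled data."""
    data = []
    for body_part, emotion in pairs:
        prompt = PROMPT_TEMPLATE.format(body_part=body_part, emotion=emotion)
        data.append((query_llm(prompt), body_part, emotion))
    return data

if __name__ == "__main__":
    for row in generate_weak_labels([("legs", "fear"), ("hands", "excitement")]):
        print(row)
```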