1. Introduction
Twitter has become a prevalent communication tool among internet users: millions of status updates that voice opinions on a variety of topics or share personal feelings, statements and events are “tweeted” every day (Kouloumpis et al., 2011). The wealth of information generated on Twitter makes it a “big data” source of citizen voice (Pak and Paroubek, 2010). It has also become an important platform for national discussion, allowing the public, scholars and politicians to obtain timely updates on public opinion about events such as terror attacks or natural disasters.
Twitter sentiment analysis (TSA) can be an effective vehicle for gaining deep insight into public opinion. Although considerable research has been devoted to binary or ternary classification of texts, much less attention has been paid to multi-class TSA, which classifies sentiment at a finer granularity. For example, in a five-class classification, a tweet is assigned to one of five classes: highly positive, positive, neutral, negative or highly negative. In this way, one can also capture the degree of positivity or negativity of the sentiment expressed in the tweet. In this paper, we propose a novel approach that classifies tweets into multiple sentiment classes.
In contrast to face-to-face communication, textual communication on Twitter lacks the non-verbal cues that give readers contextual information such as the speaker’s intention or emotional state (Pavalanathan and Eisenstein, 2016). In addition, the 140-character limit on Twitter messages makes it difficult to express anything beyond plain content. To compensate for the lack of facial cues and to overcome this limit on expression, various non-standard orthographies, such as emoticons and emojis, have been used on Twitter to communicate emotions (Kalman and Gergle, 2014; Dresner and Herring, 2010). Emojis are “picture characters” or pictographs that began to appear on Japanese mobile phones in the late 1990s. Recently, emojis have replaced emoticons and have been widely adopted to simplify the expression of emotions and to enrich communication on Twitter. Because emojis supplement, disambiguate and even enhance the meaning of messages, they have become an important area of study complementary to sentiment analysis (SA). In our previous work (Li