Emotion AI-Driven Sentiment Analysis: A Survey,

Full text

Turn on search term navigation

1. Introduction

Sentiment analysis (SA) refers to uncovering the human emotion that is conveyed within a context. It makes it possible to predict the emotion, attitude, or even the personality of a person which is expressed in the form of different aspects. Sentiment analysis identifies the human emotion underlined in the context which enables machines to understand these emotions accurately. Initially, knowledge or opinions were shared among family members, neighbors, friends, relatives, etc. in person. Now, with the evolution of technology, most of these exchanges happen online where SA plays a significant role. Technology has provided a platform for one to be exposed to thousands of opinions in minutes [1]. For example, a person can post their views on a social issue or on a product they have recently bought. These reviews also extend to movies, hotels, and restaurants [2]. Now, people are fonder of online communication; hence, both the opinions of individuals and the need for sentiment prediction in business areas have increased in order to understand the common people’s needs easily and their likes and dislikes [3]. Sentiment analysis and opinion mining (OM) are among the fields that are highly profited by these innovative approaches, and they involve an automated procedure of perceiving and recognizing human feelings [4].

This paper intended to provide a wide range of analyses on various studies on AI-driven SA and OM of emotion. In addition, this paper serves as a comprehensive review of SA and OM based on multiple approaches and methodologies, including the implicit and explicit extraction of data. This review paper includes a taxonomy of analyses on sentiment and the pros and cons of SA based on previous research works. The various levels of SA, open issues, research issues, as well as future directions on the study of sentiment and OM and their various applications are further highlighted in this review article.

1.1. Emotion AI-Driven Sentiment Analysis: Taxonomy

Sentiment analysis is sorted into the following three dimensions: the document level (DL), sentence level (SL), and feature level or angle level which is shown in Figure 1. At the DL, all the words related to emotions in the entire document are analyzed [5]. The positive or negative outcomes of the sentences are analyzed without focusing on every viewpoint. This analysis provides a general assessment of the report. At the DL, the examination supposes that the whole document has one specific theme [6]. Afterwards, it is estimated whether the tone of the entire report is positive or negative. This kind of SA is used for applications such as social and mental examinations by casual associations, client satisfaction analyses, and analyses of patients in therapeutic settings [7].

Similarly, at the SL, the aim is to discover the limit of the sentence, and the result is given at the general sentence level. It determines whether the real sentence is exceptional or objective. Also, it interprets whether the general inclination of the sentence is positive or negative in energetic sentences that are viewed as small records. It is widely used for tweets, Facebook posts, and short messages [8].

1.2. Sentiment Analysis (SA) Process

The main objective of SA is to obtain the emotion from the context. The context might be data from an online review or document; it can be anything in a large amount where human beings could delay the process while dealing with it. There are numerous steps to be followed to uncover the exact meaning and the sentiment oriented to it. Hence, this section explains the various methods of SA [9]. The process of emotion AI-driven sentiment analysis is illustrated in Figure 2.

1.2.1. Step 1: Data Collection

A dataset needs to be collected. For example, tweets as a dataset can be gathered by utilizing the Twitter API; ROAuth is required to approve the application [10]. The dataset could be of more than 5000 tweets that vary in size according to the data needed. It should include all three types of data, where structured data are in an organized format in the repository [11], semi-structured data are formatted in the form of structured data, and unstructured data are not organized and do not contain any pre-defined models [12]. The types of data are depicted in Figure 3.

1.2.2. Step 2: Training Dataset and Subjective Data

Two types of datasets are utilized for preparing the classifier: subjective information and unbiased information. The subjective dataset includes the notion inside the setting, while the impartial dataset does not include the sentiment or emotion of the specific situation [13]. Emotional information contains the opinion of a particular circumstance and conveys the emotion in two ways: glad or sad. An adequate measure of the negative and positive views (tweets) for two consecutive days was gathered to prepare the dataset by utilizing the classifier.

1.2.3. Step 3: Data Pre-Processing

Preprocessing is the initial phase in the supposition investigation, and it is done before semantically examining the vocabulary [14]. For instance, we can go back to the example of Twitter. Twitter is a platform where individuals from different parts of the world offer their perspectives as tweets in different languages. The information in these tweets may contain unstructured data that is boisterous, for instance, stop words, non-English words, and emphasis marks [15]. These kinds of unstructured information are exceedingly popular in tweets. In the preprocessing step, the tweet is divided based on parts of the speech (POS) tags. Information preprocessing includes evaluating URLs, sifting, expelling interrogative proclamations and stop words, barring unique characters, barring retweets, expelling hashtags on the perspective, barring emojis and pictures, expelling dialects other than English, and eliminating capitalized letters [16].

1.3. Comparison with Previous Surveys

Various approaches have been proposed for a suitable analysis of sentiment examination. Nevertheless, to the best of the authors’ knowledge, slant investigation is still in its initial stage and endures changes. A couple of efficient surveys have shaped this region well, and the existing solutions work well with the current advancements. The principal boundary is that estimation examination is a multi-faceted issue and includes various sub-problems; however, it is not a solitary errand [17]. Additionally, the existing overviews of conclusion, examination, and consideration are either centered around depicting particular specialized points or are primarily focused on a specific part of slant examination [18].

Among the studies that have been proposed recently, the one by Yue et al. [19] summed up an immensely critical research topic in the fields of SA and OM. This work is viewed as a reference book on emotion examinations, and the opinions that are extracted from it [20] focus on the multimodal SA addressed in both the supervised and unsupervised models. It has also detected the automatic sentiment from the context and tested the various machine learning (ML) approaches. Table 1 depicts the comparison of the previous surveys.

The rest of this survey paper is organized as follows: Section 2 presents the survey of the SA, Section 3 presents the various methodologies of the SA, Section 4 presents the results and discussions, and Section 5 elucidates the challenges, future research directions, and open issues.

2. Literature Review

This chapter presents an in-depth analysis of SA which is based on four approaches, namely, ontology-, lexicon-, machine learning-, and neural network-based approaches.

2.1. Ontology-Based Sentiment Analysis: Review

In recent developments in AI, ontologies have played an essential role in showing the relationships among class hierarchies using the concept of object-oriented programming. Ontology is defined as the precise labeling and description of the various types of relationships between an object and its properties.

There are four forms of ontology: entity, which denotes an object; the relation among things; the presence of an object in a relationship; and properties that are interrelated with the object [26]. Figure 4 shows the hierarchical structure of the general ontology for the English language. There are several reasons to build an ontology as listed below:

To examine the domain-specific knowledge;
To activate the domain knowledge for reuse;
To provide clear domain supposition;
To split the domain and functional expertise;
To provide a way to share their knowledge with the software agents.

Due to the rapid increase in the number of existing websites, information extraction has become more challenging. Most search engines are keyword based or are available as full-text search engines which poses a challenge in extracting precise information. Thus, another technique for data extraction and supposition mining framework has been proposed that depends on type-2 fuzzy ontology. This framework was designed to reconstruct the consumer’s full-text data into a proper, classical format for the search engine. This methodology provides the features that are extracted using the type-2 fuzzy ontology method.

In earlier days, SA followed the traditional analysis system which had no proper or precise use for sentiment words. This system used the short-text form to share opinions or reviews in a discussion [27]. Moreover, this led to a challenge in finding the proper sentiment. Hence, to overcome these challenges, a cross-domain SA was developed. This system has enhanced the sentiment presentation through two perspectives—microscopic and macroscopic views. The fusion of these sentiments resulted in the form by considering the simplicity and the speed of the system. It also uses simple linear insertion for fusing images and texts.

Scross media = λScontext + (1 − λ) Simages(1)

where Scross media is the fusion sentiment result, Scontext is the normalized text sentiment, Simage is the normalized image sentiment, and λ is the balanced weight (when λ > 0.5, it gives better fusion results).

The need to overcome the challenge of the ambiguities in the opinions expressed in Chinese online product reviews has led to a novel approach to identify the product aspects quickly, and it was proposed to use the opinions related to the products to build a suitable ontology [28]. The job of SentiWord is to consider the single context and the PoS presented in the statement. The SentiWord is given a score between −1 and 1, where the lowest value indicates a negative sentiment and the highest value indicates a positive one. As the SentiWord considers both the word and PoS, the statement gives a clear view of the tweet given.

Different perspectives and challenges are identified and strategies to overcome them. Sasi et al. [29] demonstrated their contribution to the analysis of the negative sentiment provided in a tweet by the consumer. To attain high customer satisfaction, they focused only on the negative opinions in the tweets related to the delivery service of the United States Postal Service. They used object properties to develop the ontology and SentiStrength to uncover the score of the statement provided in the tweet.

More past ontology-based works are presented in the Table 2 below.

2.2. Lexicon-Based Sentiment Analysis: A Review

One of the best examples of a lexicon is the shouts exchanged among the players in a match such as “Yeah!”, “Hoo!”, “Hut!”, “Blitz!”, “Hike!”, etc. Another set of examples is words used by lawyers in court: “I object, my lord”, “court adjourns”, “counsel”, etc. These sets of terminologies that are used among a group of people are known as lexicons. The phrases that have specific meanings are called lexemes. Different languages have different words with the same meaning; for example, water is thanni (Tamil), vellam (Malayalam), paani (Hindi), etc.

Lexicon-based SAs have two approaches—a dictionary-based approach and a corpus-based approach. The dictionary-based method contains words with semantic orientation, and the corpus-based approach has words with and without sentiments that can be used for other purposes as well [39]. Figure 5 shows the architecture of the lexicon-based SA, and it shows how the view is classified and the opinion is extracted from the new data. The data are trained by the learning model that classifies the data into three forms—positive, negative, and neutral.

Lexicon-based SA is used when the training data are inadequate. According to Thakkar et al. [40], unigrams were used in previous algorithms, but they did not provide satisfactory results. Hence, the authors proposed the n-grams method which is formed by the N number of unigrams to offer better results. For negative statements in the document, the authors proposed to use a ratio-based approach.

Another challenge to the domain are heterogeneity and linguistic problems. The domain that is specific to the sentiment can differ from content to content; however, when it comes to language, it differs from person to person [41]. To overcome these kinds of challenges, an algorithm that is domain-specific should be taken up with already existing lexicon-based domain-specific methods and with the utilization of undefined dictionary-based sentiment analysis (DBSA) and corpus-based sentiment analysis lexicons (CBSALs).

A few more past lexicon-based works are presented in Table 3 below.

Lexicons are built for applications, such as online product reviews, blogs, Twitter, medical forums, etc., and various works related to them are presented.

2.3. Sentiment Analysis Based on Machine Learning: A Review

Machine learning is a model that can handle any complex work that humans cannot accomplish in real-time. In this model, humans enable machines to think and learn by themselves using the experiences they have gained. For example, when the context from feedback is extracted, the aspects and features of the review are identified. Then, the identified feature is labeled with the maximum matching class. In the past, dictionaries were significantly used to understand the views pertaining to a tweet’s specific situation during research. A noteworthy issue of lexical examination is the explicit space words that are not utilized in these lexicons, in which case, deciding the notions of these dictionaries is the best test. The current work did not require preparation for the classifier to decide the conclusion for the area’s explicit lexicons. In this examination, two words were included—incredible and poor [54]. Astounding was viewed as an exceedingly positive setting, and poor was considered to be a profoundly negative setting. The score produced from each word was observed using the given rule:

(2) $Predict (S) = \frac{\log (hit (s \land good) hit (worst)}{hit (s \land worst) hit (good) - 1}$

This formula helps to understand how SA is utilized, as it is human nature to know “how and what”. Since AI is the best in class, the notion assumes an indispensable job in it [55]. Previously, scholastic understudies were found to investigate the supposition of others. The Rule-Based Emission Model (RBEM) was used to recognize the extremity in sentences. Because of the exploration, the approach performed well and gave outcomes that were exceptionally ascendable, transparent, and could work effectively. It has led to the study of a new issue of unsupervised analysis of sentiment in a signed social network. Methodologically, they have suggested consolidating the signed social relations and nostalgic signs from terms into a bound together structure when feeling names are absent. Later, these were additionally analyzed on two true signed social networks—Opinions and Slashdot. The outcomes demonstrated that the proposed Signed Senti has a fundamentally preferred execution over best in class strategies [56].

It aims to provide the automatic SA that uncovers the in-depth attitude that is held towards an entity [57]. Also, the problem in SA and multimodal sentiment analysis (MSA) in terms of a different aspect of the data has been discussed, for example, through images, human–machine interactions, human–human interaction images, videos, etc. Consider the utilization of three AI approaches—for instance, naive Bayes (NB), support vector machine (SVM), and most extraordinary entropy—to separate notions into good and bad requests. They utilize the word sack to obtain results on the gathering of the supposition examination. The results exhibited in SVM would be higher in capability than NB. [58] At the same time, exactly when the dataset is lower in size for preparing and testing, the outcome would be higher while utilizing the NB classifier.

The outcomes on the Twitter dataset that was gathered demonstrated that the accuracy of the proposed showcase was 74% greater than that of the conventional directed emotion classifiers (SVM, random forest (RF), decision tree (DT), and some semi-administered calculations) [59]. Similarly, an improved NB system showed two NB varieties using lemmas (things, action words, graphic words, and modifiers), polarity lexicons, and multiword as perspective highlights.

More past works on machine learning are presented in Table 4 below.

2.4. Neural Network Models: A Review

In the neural models, unique words are utilized as the input in the parse trees which provide the synthetic data and semantic data. Hence, emotion composition is derived from the best. Recurrent neural networks and convolution neural networks are becoming more popular, and they do not require parse trees to split their features from the given sentences. Instead, recurrent neural networks and convolution neural networks (CNNs) utilize word embedding as inputs which already inscribes the semantic and synthetic data. Additionally, the architecture of the convolution neural networks and recurrent neural networks help in learning the connectivity between the words in a statement. A recursive autoencoders network (RAN) in a semi-supervised model for sentence-level SA resulted in providing a low-dimensional vector representation [68]. In a new matrix-vector recursive neural network (MVRNN), each context is also related to a matrix representation in the form of a tree [69].

The structure of a tree is derived from a parser that is used externally. It portrays a collaboration of Recurrent Neural Network (RNN) and CNN architecture for the classification of the sentiment from a short context which takes the favorable position in the coarse-grained features generated by a CNN [70].

An approach based on the linguistics LSTM for effective sentiment prediction incorporates the sentiment lexicon as a highly intensive context and negative context [71]. The LSTM includes these features while analyzing the sentiment to provide a useful view of the context. Authors have presented a traditional CNN–LSTM model which consists of two sections—local CNN and LSTM—to predict the attitude illustrated in the content [72]. Table 5 presents the overall pros and cons identified from the survey.

3. Methodology

The two important methodologies used for sentiment analysis, such as the machine learning-based approach and lexicon-based approach, are discussed in the next section.

3.1. Machine Learning Approaches

As discussed in the literature review, SA can be performed through various methods. Figure 4 shows the categorization of the methods in the analysis of sentiment and aggregation of opinions.

Figure 6 mentions the different approaches that are discussed here. The methods are a probabilistic classifier, linear classifier, rule-based classifier, DT classifier, and NB classifier.

The probabilistic classifier (PC) provides a probabilistic function to the set of input data. Further, the input function f(x) is applied with the probability and is mapped with the output function (y). Hence, the PC is denoted as follows:

Y′ = f(x)(3)

where f(x) is the input function. When the conditional distributors replace the PC, it is as follows:

Z′ = arg max Z Pr (Z′ = z/V′)(4)

where the PC is changed to the conditional classifiers Pr(Z′/V′), and the given z Z′ is assigned to v V′.

In a linear classifier, word T = (T1, T2, T3) is the frequency of a single word, the vector V = (V1, V2, V3) is a linear input coefficient, and the scalar S is the linear output coefficient. This classifier categorizes the margin to distinguish between the two classes [73].

In the rule-based classifier, the “if condition and then decision” approach is used to make specific rules. This rule-based classifier is also known as a multi-class classifier. Furthermore, this classifies the data into three forms—good, bad, and neither good nor bad. Besides, this unsupervised classifier mainly focuses on the prediction of emotion in the context and emoticons [74].

The DT classifier is a recursive one. For the training data, the condition based on the classification is applied, and it divides the training data such that the ones which satisfy the conditions are all of one class, and the procedure continues until all the data satisfy the condition [75].

The NB approach examines whether the feature provided with the probability has the role of the label or not. In this, every feature is independent and the specific feature is mapped with a label matched in the maximum.

P(E/A) = (P(E) × P(A/E))/P(A)(5)

Here, P(E) is the probability of the label, and P(A) is the probability of the feature. The NB classifier is mainly required to classify features such as email IDs, URLs, words, phrases, dictionaries, parse trees, etc. The NB algorithm is solely used for the textbook, and it classifies the string and not any of the numerical data or subsets. This classifier is a class-specific unigram language model. The likelihood features are assigned the probability of every single word, and they propose the probability to each sentence as follows:

(6) $P (sentence | count) = \prod P (word | count)$

At present, for the given parameters, the positive and negative values were specified. The grade was multiplied altogether by the positive score and the negative score. The total positive score was found to be 0.0000005, and the overall negative rating was 0.0000000010 as shown in Table 6. For the given statement, a higher probability was given to the positive. Hence, the given statement is irrefutable. Figure 7 shows how the SA process was executed as well as how the classification was done for the training data using the NB classifier. The training data as classified as positive, negative or neutral. The knowledge-based method determined the record of the appearance of a word in a particular record or report.

After finding the occurrence of a word, similar words are grouped together. Then, the NB approach is utilized to classify the test data and predict the sentiment to document as positive, negative or neutral [76]. The Bayesian network model used in it is a directed acyclic graph. There is a strong relationship between the feature and the label in this model. Maximum entropy deals with probability. Initially, to set-up maximum entropy modeling, the characteristics should be selected to determine the constraints. In-text classification—the use of word count—is supposed to be the feature.

(7) $P (L / F) = P (L) \times P (\frac{F}{L}) / P (F)$

where “P” refers to a function, “L” refers to label, “F” refers to features, P(L) refers to the probability of label, P(F/L) is the probability of the feature categorized as label, and P(F) refers to probability of the feature.

Further, this model is executed by the vectors. Hence, in this model, the labeled features are converted into vectors. The weight of the feature is allocated. The prediction of the label and the feature is performed to calculate the sentiment of the context. Here, the feature and the label were mapped in the form of vectors. The above representation [77] of the maximum entropy shows that if any of the words occur from the same class, the weight for that class will be higher. The primary advantage of the maximum entropy is that it utilizes the natural binary features.

The SVM algorithm classifies the data by using the hyperplane such as the positive or negative forms. In this model, no probability was applied; hence, it was not a PC. Support Vector Machine approach is exceptionally efficient in text categorization [78] and is better than PCs. The motivation behind SVM is to recognize the hyperplane which the vector symbolizes. This records vectors in single class word vectors and sentence vectors in a single year. So far, its accuracy is more than 90% which is high when compared to the NB and maximum entropy method.

(8) $vector = \sum_{j} α j, countj, documentj vector, α j \geq 0,$

Here, αj refers to a dual optimization problem and document.

j vector − αj > 0 = support vector machines. The classification part plays a major and versatile role in identifying the hyperplane and making sure which constraint falls under the set margin [79].

Random Forest approach is said to be a tree classifier. Every single class in the tree is given an information vector, and the most elevated class is taken into the point. The error rate is dictated by the connection between the two trees in the woods and the weight of individual trees in the forest. Further, to decrease the error, the trees ought to have substantial weight or quality and be free of one another. The RF takes the DT as the individual predictors that are based on the methods of randomizing outputs, boosting, and bagging. Any large number of datasets can be easily classified using RF methods with better accuracy [80].

Algorithm 1.

Input: A = Total number of trees D = Training dataset F′ = Features f′ = Sub featuresOutput: Label of bagged class 1. For each and every tree in A (i) Create a bootstrapping sample S of size D (ii) Create a tree recursively to all the internal nodes with the following steps: Step 1:- Choose the sub feature f′ randomly at feature F′ Step 2:- Best sub feature f′ has to selected Step 3:- Finally split the node 2. Test instance will be passed to trees once trees are created 3. Later, a majority of votes are provided by assigning the class labels

The neural network model encompasses three layers: an input layer, a hidden layer, and an output layer [81]. Artificial neural networks (ANNs) are used for learning by applying a neural network of multiple layers in it. Furthermore, this is more powerful in representing the neural network, and also, it is more practical with a minuscule quantity of data and a minimum of two phases. The neural network is categorized into the recurrent neural network and the feedforward neural network. Several activation functions are to be used which include ReLu, tanh for sigmoid function, and leaky ReLu.

(9) $f (W^{'} x) = sig (W^{'} x) = \frac{1}{1 + \exp {(- W^{'} x)}^{'}}$

(10) $f (W^{'} x) = Relu (W^{'} x) = \max (0, W^{'} x)$

Sentiment Analysis in the neural network is done with the initial representation of the word into a vector and by word-level word embedding, character-level embedding, sentence-level embedding, and training the network. Numerous profound learning models that are utilized as a part of the NLP, which require the word embedding, come as information highlights. This word embedding changes over the setting into a vector of consistent, genuine numbers; e.g., word—Hai (H—0.13, a—0.15, and i—0.23). The experimentation is finished with a few calculations. Furthermore, it helps to estimate productivity and precision [82]. The convolution neural network is typically used in the picture arrangement for the examination of emotion. Likewise, it isolates the strings where every single independent setting is changed to vector [83]. Figure 8 shows the flow process of convolution layers.

Algorithm 2.

Training CNN for sentiment analysisInput: f(x) = {x₁, x₂, …. x_n} , f(y) = {y₁, y₂,…. y_n} Step 1: Train the CNN with input f(x)and f(y) Step 2: Sentiment score of f(x) predicted Step 3: Let X be the leftover training data and Y be their sentiment Step 4: Let the CNN fine-tune with the input X and Y Step 5: Return (the average performance)Output: Sentiment of the data

The LSTM approach is a versatile type of recurrent neural network which handles the dependencies. The entire recurrent neural network follows the chain rules and is done recursively until the process achieves the optimization stage [84]. The recursive iteration in the LSTM is complicated when compared to ordinary and straightforward RNN because it has four layers that interact in a particular manner. From the cell state at the timestamp t, the LSTM decides which data that should be in the dump. This is determined by utilizing the sigmoid function (σ), which is called the “forget gate.” The capacity guarantees hit – 1 (yield from the past covered layer) and it (current data) yields a number in ‘0’ or ‘1’ where 1 means “thoroughly keep” and 0 implies “absolutely dump”.

(11) $f = σ (W^{'} x + U^{'} h (t - 1))$

3.2. Lexicon-Based Approaches

3.2.1. Sentiment Analysis in Gram Representation

Unigram is the representation of the word presented in the document. In addition, it is associated with the feature value of the word in the document which is referred to as term frequency (TF). Unigram is said to be a single word in the document. Every single word that is taken into account from the document presented is known as a unigram. Alternatively, every pair of words is called a bigram which is used for bigram representation from the document. Here, the feature is associated with the bigrams in the document. Further, in n-gram representation, “N” number of words in the document were considered [85].

3.2.2. Term Frequency- Inverse Document Frequency (TF-IDF) Representation

Term frequency-inverse document frequency is the representation of TF-IDF. Here, the word presented in the document is given by “total frequency (word, document) × log (inverse document frequency (word, document)”. A log is presented here for the computation of base 10, and d is the training data which are the collection of the document presented [86].

3.2.3. Dictionary-Based Approach

The new terminologies are collected manually by this approach, and then a list of synonyms and antonyms of the terms are formed. It is later matched to the list, and words with similar meanings are grouped together. This process continues whenever a new term is found [87].

3.2.4. Corpus-Based Approach

The corpus-based method is applied to a particular topic. The corpus approach has two forms: statistical approach and semantic approach. The corpus-based approach is mainly used for addressing languages. The corpus data are extracted from the corpora which has a large amount of data, and it also has the actual pattern of the language used in day-to-day life [88].

3.2.5. Statistical Approach

This approach is used to find the occurrence of words. The principal goal of this approach is to determine the extremity between positive and negative words. When positive data are high, the entire data are positive and vice versa for negative data. Cosine similarity is one of the statistical approaches utilized in determining the sentiment and the opinion that is uncovered from the context. Cosine similarity shows the similarity among two vectors which is non-zero. Cosine similarity determines the polarity, whether positive or negative [89].

4. Results and Discussions

First, the evaluation metrics that were considered for comparison between the approaches are discussed here. In order to find the effectiveness of every classifier, the following guideline was utilized to find the precision, support, review, and F-measure [90].

4.1. Evaluation Metrics

4.1.1. Prediction Accuracy

Generally, to analyze accuracy, the following rule is applied which determines how the sentiment is calculated and determined accurately. Besides, this is said to be the precision measure [91].

(12) $P r e d i c t i o n a c c u r a c y = \frac{L T T}{T T T}$

where LTT refers to the labeled Twitter tweets and TTT refers to the total tweets.

4.1.2. Refined Measure on Tweet Precision

The fraction of the data that are retrieved in a relevant manner is defined as follows:

(13) $R e f i n e d m e a s u r e f o r p o s i t i v e d a t a = \frac{P T T}{T T P + F P}$

(14) $R e f i n e d m e a s u r e f o r n e g a t i v e d a t a = \frac{N T T}{T T N + F N}$

where TTP refers to total positive tweets, and TTN refers to total negative tweets.

4.1.3. Recall

The fraction of the data that are retrieved in a relevant manner is defined as follows

(15) $R C = \frac{P T T}{T T P + F N}$

4.1.4. F-Measure

The F-measure is used to evaluate the false rate, and the formula used to define it is as follows:

(16) $F - m e a s u r e = \frac{2 \times R e f i n e d m e a s u r e \times R C}{R e f i n e d m e a s u r e + R C}$

In this article, the training dataset and testing dataset were taken from Twitter. The tweets were collected based on a movie. Table 7 depicts the total dataset used for testing and training the sentiment. This particular movie data were tested with three different approaches: ontology-based SA, lexicon-based SA, and machine-learning-based SA. The results are given as follows:

4.2. Using Ontology-Based Sentiment Analysis

In the ontology-based SA, four primary conventional approaches were tested: specific ontology-based SA, fuzzy logic-based SA, aspect-based SA, and domain-specific SA. The aspect-based SA resulted in 83% accuracy, 84% recall, and a F-measure of 50%. This shows that the exactness of prediction was high compared to the other approaches for the considered data. The results are shown in Figure 9.

4.3. Using Lexicon-Based Sentiment Analysis

The four major approaches that were used to test the lexicon-based SA were term frequency approach, word count, unigram, and bigram. The precision of the word count was 91%, the recall was 88%, and the F-measure was below 10% for the considered data. The results are shown in Figure 10.

4.4. Using Machine-Learning-Based Sentiment Analysis

In the machine learning approaches, the multinomial NB algorithm, RF algorithm, SVM, and XG Boosting algorithm were tested. The SVM resulted in high accuracy of 96%, recall of 66%, and F-measure of 60%. Moreover, the machine-learning-based SVM approach achieved 96% accuracy for the considered data. These results may vary when the dataset varies. The results are shown in Figure 11.

The above discussion provided quantitative results of various approaches in Sentiment Analysis. Recently, Twitter data analysis has been used the most to predict sentiment. The overall merits and demerits identified from this work is presented in Table 8.

5. Challenges, Future Research Directions, and Open issues

Despite the several advantages of emotion AI-driven SA, there are significant challenges that have to be focused on. Resolving the below-mentioned challenges will make SA more efficient and effective, so it can be applied everywhere.

5.1. Mining Unstructured Data

In the broad range of social media, all kinds of users are present, from well-educated to uneducated users. To save time, some users have started to text in the message format; for example, to convey the message “happy with this car,” the user would text “happppppppppy vth dis ”. This kind of text is considered to be unstructured data. Many of the SA methods preprocess this information. Hence, extracting and providing the sentiment and the emotion to these kinds of data are really challenging [92].

5.2. Identifying Composite Media Features

Generally, this is one of the pressing issues centered around the notion of examination in the light of the fact that remarks or audits may bring attention to any issue. Table 9 depicts user data examples [93].

Extracting these kinds of data, identifying the feature of the data, then determining the opinion are immense challenges in SA, because several million users use online shopping, and each has a different manner of using language to express feedback [94].

5.3. Different Words with the Same Meaning

In a user’s review, different words with the same meaning might be used. It is necessary to classify the similarity among each word, as some words are placed differently in some sentences which may cause them to sound different, even though they have the same meaning [95].

Sentiment words that do not express a sentiment—some of the words in interrogative statements may not explicitly express any sentiment. Still, the emotion present in the statement should be identified as another challenge [96].
Emotion identification (sarcasm)—it is difficult for the machine to identify sarcastic statements. Researchers on SA work hard to identify sarcastic comments with high accuracy, as human emotions and attitudes are often ambiguous [97].

A significant challenge faced by SA is making the machine understand intense human emotions conveyed in the context. With a rise in the usage of unstructured data, human language has become highly complicated, and it is difficult to determine the opinions, viewpoints or reviews of the customer as well as the right sentiment of the context [98]. The open issues on SA are summarized in Figure 12 as follows:

Once these challenges are addressed, the possible outcome can benefit from understanding the affinity or sentiment toward a particular phenomenon, entity or idea. Also, understanding the customers’ perspectives on a specific aspect of a product, brand or advertisement is challenging [99]. A more accurate interpretation of the sentiment is expressed in an unstructured data format that can be evaluated [100]. It further involves accessing customers’ feedback to measure customers’ satisfaction and effective e-governance and crisis management [101]. Figure 13 demonstrates the scope of future research and open issues in detail.

6. Conclusions

In this paper, an overview of emotion AI-driven SA in various domains was presented. Also, this survey reviewed the merits, demerits, and scope of the different approaches that have been considered. A significant advantage of SA is that it provides the exact emotion that is underlined in the context. Traditional methodologies, such as machine-learning-based approaches, lexicon-based analysis, and ontology-based analysis, were considered for experimentation to compare performances. In the considered sample data, the aspect-based ontology approach, SVM, and term frequency achieved high accuracy and provided better SA results in each category. Future research directions as well as limitations were also highlighted for the benefit of future researchers. Even though the results showed higher accuracy for the sample data considered, these results may vary when it is applied to other applications. Deep learning approaches can also be considered for comparing the performances as part of the future work which may bring significant changes to the results.

Author Contributions

Conceptualization, P.C., D.R.V., and K.S.; Methodology, P.C., D.R.V., and K.S.; Software, P.C., V.S., and D.G.R.; Validation, C.-Y.C.; Formal Analysis, V.S., and D.G.R.; Investigation, P.C., D.R.V., and K.S.; Resources, K.S., and C.-Y.C.; Data Curation, P.C.; Writing-Original Draft Preparation, P.C., D.R.V., and K.S.; Writing-Review & Editing, V.S., C.-Y.C. and D.G.R.; Visualization, K.S.; Supervision, D.R.V.; Project Administration, C.-Y.C.; Funding Acquisition, C.-Y.C.

Funding

This research was partially funded by “Intelligent Recognition Industry Service Research Center” from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan. Grant number: N/A and the APC was funded by the aforementioned Project.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Abbreviations

ABBREVIATION	FULL FORM
NB	Naive Bayes
MNB	Multinomial naive Bayes
SVM	Support vector machine
ME	Maximum entropy
NN	Neural network
ANN	Artificial neural network
CNN	Convolution neural network
RNN	Recurrent neural network
LSTM	Long short-term memory
RF	Random forest
LC	Linear classifier
PC	Probability classifier
PSDEE	Polarity shift detection, elimination, and ensemble
TF	Term frequency
TF-IDF	Term frequency-inverse document frequency
DT	Decision tree
BoW	Bags of word
Pos	Parts of speech
FSC	Fuzzy semantic classifier
RFWC	Relative frequency word count
KNN	K-nearest neighbor
CRF	Conditional random fields
DL	Document level
SL	Sentence level
FL	Feature level

Figures and Tables

Figure 1. Taxonomy: emotion AI-driven sentiment analysis.

View Image - Figure 2. Process of emotion AI-driven sentimental analysis, K-Nearest Neighbour (KNN), Convolutional Neural Network (CNN), Support Vector Machine (SVM).

Figure 2. Process of emotion AI-driven sentimental analysis, K-Nearest Neighbour (KNN), Convolutional Neural Network (CNN), Support Vector Machine (SVM).

Figure 3. Depiction of types of data.

Figure 4. The structure of the ontology: terminologies.

Figure 5. Architecture of lexicon-based sentiment analysis.

Figure 6. Various methods of emotion AI-driven sentiment analysis.

Figure 7. Sentiment analysis using the naive Bayes classifier.

Figure 8. Convolution neural network flow process.

Figure 9. Evaluation metrics of ontology-based sentiment analysis.

Figure 10. Evaluation metrics on lexicon-based sentiment analysis.

Figure 11. Evaluation metrics of machine-learning-based sentiment analysis.

Figure 12. Summary of open issues.

Figure 13. Scope of future research and open issues.

Table 1

Comparison of the previous surveys.

Reference Number	Survey Objective	Survey Outcome
[21]	To survey the development of sentiment analysis on images, videos, blogs, etc.To discuss visualized sentiment analysis and speech sentiment analysis.	It focused on the methodology to be used to provide the sentiment for both speech and visualization.It discussed how the automatic sentiment analyzer could be proposed for both speech and visual data.
[22]	To focus on increasing precision and reducing the false rate of sentiment analysis.	The evaluation metrics were discussed, and they showed the experimental results.It explained how the supervised and unsupervised algorithms work.
[23]	To conduct a sentiment analysis for the communities.	It discussed fine-grained sentiment analysis and algorithm classification.
[24]	To depict the n-gram, unigram, and focuses on interpreting the sentiment in every single sentence.	It discussed the neural network-based approach and word embedding and elucidated how the recurrent neural network works.
[25]	To conduct a survey to implement Twitter sentiment analysis.	It presented the methodology of HybridSeg and discussed subjective data.

Table 2

Chronological view of ontology-based sentiment analysis.

References Number	Problem Identified	Methodology	Dataset Used	Results	Limitation
[30]	Data sparsity problem	Specific domain ontology	Tweets	82%	Constrained accuracy
[31]	To improve the accuracy	Microblog specific sentiment lexicon	Dataset from Tencent Weibo (2013) on 20 topics	84.3%	Only for Chinese blogs
[32]	Binary classification problem and accuracy	Fuzzy ontology (FO) with machine learning technique (ML)	Hotel reviews	82.70%	Increased complexity
[33]	Binary classification problem	Ontology at aspect extraction	Tweets	82.9%	Not suitable for all domains
[34]	To increase the efficiency of the project	Ontology-based text mining method (OTMM) and Self Organizing Maps (SOM)	National Natural Science Foundation of China (NSFC), 110,000 proposals	91.2%	Time-consuming technique
[35]	Polarity shift problem	Polarity Shifting Device Model	Movie reviews	87.1%	Limited accuracy
[36]	Lexicon-based approach	Movie review dataset	77.6%	Hard to refresh the word reference
[37]	Sentiment compression technique before the aspect-based sentiment analysis	Chinese blog review dataset	88.78%	Extractive compression technique failed to achieve accuracy
[38]	For enhancing the prediction accuracy of the unemployment rate	Domain ontology	Unemployment Initial Claims (UICs) values between January 2004 and March 2012, US Department of Labor	81.7%	Rate prediction was not accurate

Table 3

A chronological view of lexicon-based sentiment analysis.

Reference Number	Level of Analysis	Approach	Dataset Used	Results
[42]	Feature-based term weight	Term Frequency (TF)	Multi-domain dataset	85%
[43]	-	Support Vector Machine	Tweets	75.5%
[44]	Feature based	Polarity classifier,enhanced emoticon classifierSentiWordNet-based classifier	Tweets	87%
[45]	-	Relative frequency word count	Drug, car, hotel	80%
[46]	Word level	Relative frequency word count (RFWC)	Tweets	Less than 50%
[47]	Sentence level	Weight scheme	Tweets	70%
[48]	Phrase level	OpenDover (web service)	Tweets	75%
[49]	Sentence level	Semantic-based lexicon analysis	Tweets	-
[50]	Word level	Word emotion lexicon	Socio-political (SP) and sports	Did not add value
[51]	SentiWord level	Opinion works affixation	9 various emotion/sentiment database	94%
[52]	Feature based	Opinion mining and ranking adjective count algorithm	DVD players	97.1%
[53]	Link based	Sentiment strength propagation approach	Hotel survey report	72%

Table 4

A chronological view of machine-learning-based sentiment analysis.

	[60]	[61]	[62]	[63]	[64]	[65]	[66]	[67]
Approach	Unsupervised	Supervised	Classifier approach	Classifier approach	Supervised	Unsupervised	Hybrid	Supervised
Features	Sentiment orientation weight	Sentiment orientation weight Multi-nomial Sentiment Analyis MSA	Word stem + n-gram	Word stem + n-grams MSA	n-gram	Semantic sentiment weight	n-grams	Word stem, SentiWord
Algorithm	Based lexicon	Based lexicon	SVM, Naïve Bayes (NB), KNN	SVM, NB, KNN	SVM, NB	Rule-based and lexicon-based	SVM + lexicon-based approach	SVM, NB
Dataset	Twitter	Social network	Aljazeera	Twitter	Facebook and blogs	Twitter	Twitter	Microblog
Accuracy	83%	97%	96%	67%	75%	46%	84.01%	87%
Limitation	Negation and intensification were not considered	Manual extraction and feature score were done	Fewer features and classifiers were used	Could use more features as well as/Parts of Speech tag	More classifiers can be included to reduce the false rate	Did not add value	Negation was not considered	Usage of the dictionary-based rule to translate the word to MSA

Table 5

Advantages and disadvantages of existing research work on sentiment analysis.

Pros	Cons
• Many utilized fewer parameters.	• Some of the approaches require labeled data.
• Labeling data is not necessary for feature extraction.	• It does not count the absence of text.
• Training is done easily.	• Features are confused at times during analysis.
• Some of the low-featured data can be extracted easily.	• When there is more than one possible opinion, it fails to handle the situation.
• Opinion words are categorized into different forms.	• Many consider only adjectives.
• Develops new standardized data.	• Exact feature identification is complicated.
• Implicit statements are considered in the text during extraction.	• Unlabeled data are not analyzed properly.
• Some of the lexicon approaches provide less false rates.	• It does not analyze short text properly.
• Domain-specific data achieve high accuracy in predicting sentiments.	• Even though it ensures high accuracy, F-measure is also high.
• Neural net models on predicting the sentiment achieve high accuracy with a less false rate.	• Building neural net ID is highly complicated.
• Ontology sentiment analysis provides high accuracy when it is domain-specific.	• The false rate in the analysis is high in ontology when compared to the lexicon.

Table 6

Word count.

Word	Positive Word	Negative Word
She	0.100	0.20000
Like	0.100	0.00010
This	0.010	0.01000
Comedy	0.050	0.00500
Movie	0.010	0.10000

Table 7

Statistical view of dataset.

Dataset	Positive	Negative	Neutral	Total
Testing	10,000	5800	4200	20,000
Training	2000	2000	2000	6000

Table 8

Merits and demerits for lexicon, machine learning and hybrid approaches.

Types	Approaches	Merits	Demerits
Lexicon based	Dictionary-based and rule-based approachEnsemble modeling	There is more extensive-term analysis and, hence, accuracy reaches a high.	This is strictly limited to a smaller number of words in the lexicon trained; hence, it is utilized in the fixed score of opinion.
Machine learning	Bayesian networksSupport vector machinesMaximum entropyRandom forestDecision treeNeural networks	It is efficient in training the model for a particular purpose.	It is only applicable when the data are labeled. This is more costly.
Hybrid	Lexicon + machine learning	It is done at the sentence level, so it is easy at the document level.	It is complex and too noisy.

Table 9

User data example.

User	Comment
User 1	“the amazon delivered the paste before the delivery date,”
User 2	“Paste tastes really good, and my cavity is reducing.”

Word count: 7458

Show less

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Translate

The essential use of natural language processing is to analyze the sentiment of the author via the context. This sentiment analysis (SA) is said to determine the exactness of the underlying emotion in the context. It has been used in several subject areas such as stock market prediction, social media data on product reviews, psychology, judiciary, forecasting, disease prediction, agriculture, etc. Many researchers have worked on these areas and have produced significant results. These outcomes are beneficial in their respective fields, as they help to understand the overall summary in a short time. Furthermore, SA helps in understanding actual feedback shared across different platforms such as Amazon, TripAdvisor, etc. The main objective of this thorough survey was to analyze some of the essential studies done so far and to provide an overview of SA models in the area of emotion AI-driven SA. In addition, this paper offers a review of ontology-based SA and lexicon-based SA along with machine learning models that are used to analyze the sentiment of the given context. Furthermore, this work also discusses different neural network-based approaches for analyzing sentiment. Finally, these different approaches were also analyzed with sample data collected from Twitter. Among the four approaches considered in each domain, the aspect-based ontology method produced 83% accuracy among the ontology-based SAs, the term frequency approach produced 85% accuracy in the lexicon-based analysis, and the support vector machine-based approach achieved 90% accuracy among the other machine learning-based approaches.

Details

Title

Emotion AI-Driven Sentiment Analysis: A Survey, Future Research Directions, and Open Issues

Author

Chakriswaran, Priya¹; Durai Raj Vincent¹

; Srinivasan, Kathiravan¹

; Sharma, Vishal²

; Chuan-Yu, Chang³

; Daniel Gutiérrez Reina⁴

¹ School of Information Technology and Engineering, Vellore Institute of Technology (VIT), Vellore 632 014, Tamil Nadu, India; [email protected] (P.C.); [email protected] (K.S.)
² Department of Information Security Engineering, Soonchunhyang University, Asan 31538, Korea; [email protected]
³ Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Yunlin 64002, Taiwan
⁴ Department of Electronic Engineering, University of Seville, 41092 Sevilla, Spain; [email protected]

First page

5462

Publication year

2019

Publication date

2019

Publisher

MDPI AG

e-ISSN

20763417

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/app9245462

ProQuest document ID

2533769726

Emotion AI-Driven Sentiment Analysis: A Survey, Future Research Directions, and Open Issues

Jump to:

Full text

Abstract

Details

Suggested sources