Abstract
Emotion recognition in social media is a challenging task due to the complex and unstructured nature of user-generated content. In this paper, we propose Emotion-RGC Net, a novel deep learning model that integrates RoBERTa, Graph Neural Networks (GNN), and Conditional Random Fields (CRF) to enhance the accuracy and robustness of emotion classification. RoBERTa is employed for effective feature extraction from unstructured text, GNN captures the propagation and influence of emotions through user interactions, and CRF ensures global consistency in emotion label prediction. We evaluate the proposed model on two widely-used datasets, Sentiment140 and Emotion, demonstrating significant improvements over traditional machine learning models and other deep learning baselines in terms of accuracy, recall, F1-score, and AUC. Emotion-RGC Net achieves an accuracy of 89.70% on Sentiment140 and 88.50% on Emotion, highlighting its effectiveness in handling both coarse- and fine-grained emotion classification tasks. Despite its strong performance, we identify areas for future research, including reducing the model’s reliance on large labeled datasets, improving computational efficiency, and incorporating temporal dynamics to capture emotion evolution in social networks. Our results indicate that Emotion-RGC Net provides a robust solution for emotion recognition in diverse social media contexts.
Introduction
With the widespread use of the internet, social media has become a crucial platform for daily communication and information sharing. Users express their emotions and opinions through various forms such as text, images, and videos, making emotion analysis a hot research topic in fields like social sciences, marketing, and intelligent human-computer interaction. Emotion recognition is not merely about determining sentiment polarity (e.g., positive or negative); more importantly, it involves identifying nuanced emotion categories (e.g., anger, happiness, sadness) to better understand users’ psychological states [1]. As a result, the challenge of efficiently and accurately identifying and classifying emotions from massive volumes of social media data has become a research area of significant practical value. In the commercial realm, emotion recognition can help companies better understand user feedback, optimize products and services, and improve customer satisfaction [2]. In social network analysis, emotion recognition aids in uncovering patterns of emotion propagation, potentially enabling the early detection of collective emotional shifts and the prevention of rumors and negative emotions from spreading. Therefore, developing an automated model capable of accurately capturing the emotions of social media users is not only vital for business applications but also holds broad social significance. However, social media text often includes unstructured and noisy data, such as slang, emoticons, and abbreviations, which complicates the task of emotion recognition [3,4]. Moreover, the relationships between users and the patterns of emotion propagation within networks are also complex factors that must be considered in emotion analysis. As such, existing emotion recognition methods face challenges in efficiently extracting emotional features from complex data and capturing the relationships driving emotion propagation.
In recent years, the rapid development of deep learning technologies has led to significant advances in emotion analysis. Methods based on convolutional neural networks (CNNs), recurrent neural networks (RNNs), and pre-trained language models (such as BERT and RoBERTa) have demonstrated powerful performance in natural language processing tasks. These models automatically extract features from data, reducing the need for manually designed features, which traditional emotion analysis methods heavily relied upon [5]. For instance, pre-trained models like BERT and RoBERTa, through large-scale unsupervised training, capture rich contextual information, enabling them to achieve outstanding performance in downstream tasks such as emotion classification. Despite the many advantages deep learning brings to emotion analysis—such as improving generalization and reducing reliance on labeled data—there are still several unresolved issues [6]. First, social media texts often contain informal language, emoticons, and ambiguity, making it difficult for text-based feature extraction methods alone to accurately capture users’ true emotions. Second, the propagation of emotions within social networks is a complex process, where emotions from different users often influence each other. Traditional emotion analysis methods generally overlook the relational structure between users, leading to suboptimal performance when dealing with large-scale emotion propagation in social networks [7]. Moreover, while pre-trained language models perform well in feature extraction, they typically lack the ability to model emotion propagation and user relationships, limiting their applicability in social media emotion analysis.
To address these issues, this paper proposes a novel emotion recognition method, Emotion-RGC Net, which combines the strengths of the RoBERTa pre-trained language model and Graph Neural Networks (GNNs) to enhance the accuracy of emotion recognition in social media. RoBERTa is leveraged to extract rich emotional features from textual data, capitalizing on its powerful capabilities in natural language understanding [8]. GNNs, on the other hand, excel at processing user relationship networks and can capture the interactions and propagation of emotions between users [9]. Additionally, Conditional Random Fields (CRF) are introduced in the post-processing phase to further refine the emotion classification results, considering the interdependencies between emotion categories [10]. This innovative multi-module model not only accurately identifies emotions in social media but also models the propagation of emotions, overcoming the limitations of traditional methods that fail to effectively utilize user relationship networks.
The main contributions of this paper are as follows:
* We propose an end-to-end emotion recognition model that combines RoBERTa and GNN, effectively capturing both textual and relational features.
* We enhance emotion classification accuracy by incorporating a Conditional Random Field (CRF) layer.
* We provide an in-depth analysis and modeling of emotion propagation mechanisms, expanding the application scenarios of emotion analysis.
Related work
Multimodal emotion analysis
In recent years, multimodal emotion analysis has gained widespread attention. Compared to unimodal approaches (such as pure text), multimodal emotion analysis combines different types of data sources (e.g., text, audio, video) to capture emotional expressions more comprehensively and accurately through the fusion of multiple information streams. For instance, in video-based emotion analysis, researchers do not only analyze textual data but also integrate non-verbal cues such as speech tone and facial expressions to build a multimodal understanding of emotions [11,12]. This multi-angle approach effectively compensates for the limitations of single-modality data and significantly enhances the accuracy and robustness of emotion recognition. The core of multimodal emotion analysis lies in feature fusion—how to effectively combine data from different modalities and process them in a unified way. This process typically relies on advances in deep learning, particularly methods like convolutional neural networks (CNNs) and recurrent neural networks (RNNs), which can automatically extract high-dimensional emotional features from raw audio, video, and textual data [13]. Recently, researchers have begun experimenting with more advanced models, such as Transformer models based on attention mechanisms, to strengthen the connections between different modalities and better capture emotional information. Additionally, by employing joint learning on multimodal data, models can further improve their performance in complex emotional scenarios, such as when users express mixed emotions [14]. The fusion of multiple modalities helps to accurately recognize these nuanced emotional states.
While multimodal emotion analysis offers significant advantages, it also faces considerable challenges. Data from different modalities often have varying formats and temporal characteristics, making data fusion complex. Text processing typically relies on the order of words or sentences, whereas audio and video require temporal information to capture emotional changes [15]. Synchronizing and coordinating information across modalities is one of the key difficulties in multimodal emotion analysis. Furthermore, multimodal approaches usually require substantial computational resources and access to high-quality multimodal datasets, which can be costly to acquire and process. Another issue is data inconsistency—there may be redundancy or conflicts between information from different modalities, and finding ways to preserve relevant information while reducing noise is an ongoing challenge.
Despite these challenges, the potential of multimodal emotion analysis to accurately recognize emotions remains enormous, especially in social media contexts where users often express emotions not only through text but also through images, videos, and even audio [16]. Multimodal emotion analysis thus provides a more comprehensive perspective for emotion recognition. With advancements in deep learning technologies and hardware infrastructure, future research in this field is expected to overcome current challenges, allowing multimodal emotion analysis to play a role in a wider range of applications [17]. In this context, combining textual data with other modalities for emotion analysis is becoming a trend, but in the practical application of social media, text remains the primary medium for conveying information. Efficiently utilizing deep learning techniques for unimodal text-based emotion analysis remains a key research direction. This paper adopts this perspective and proposes a text-based emotion recognition model, Emotion-RGC Net.
Social media emotion propagation network models
On social media platforms, emotions are expressed not only through individual posts and interactions but also propagate through exchanges, reposts, and comments between users. This process of emotion propagation makes emotion analysis not just an individual-level task but a networked phenomenon [18]. In social networks, emotions often spread and evolve in complex ways between users. A user’s emotional state can directly influence other closely connected users, creating a "contagion effect." For instance, when a user expresses strong emotions in a social media post, it can trigger responses or reposts from others, further amplifying and spreading the emotion across a larger network [19]. Research shows that this kind of emotional propagation in social networks follows certain patterns, particularly in how different network structures can affect the strength and speed of emotion diffusion. Effectively modeling these emotional relationships has become a key challenge in improving the accuracy of emotion recognition [20]. To capture and model emotion propagation patterns on social media, Graph Neural Networks (GNNs) have gained significant attention in recent years. GNNs are capable of processing graph-structured data, where nodes represent users and edges represent interactions between them, allowing the construction of a network model that can simulate emotion propagation. Unlike traditional emotion analysis models, GNNs iteratively update the representations of nodes, learning the emotional interaction patterns between users [21]. For example, a user’s emotional state is influenced not only by their own textual features but also by the emotional states of connected users. Through this process, GNNs can capture the complexity of emotion propagation in social networks, providing richer information for emotion classification.
Recent advancements in emotion propagation network models based on GNNs have yielded promising results. Studies have utilized GNNs to analyze interaction patterns between users and found that emotion propagation follows different paths and intensities depending on the structure of the social network [22]. On platforms like Twitter, emotional interactions between users often involve multiple forms of engagement, such as retweets, comments, and likes. By modeling these interactions with GNNs, it becomes possible to predict users’ emotional states more accurately and even forecast the spread of emotions in advance. However, GNNs still face challenges in modeling emotion propagation [23]. The relationships between users are highly complex, and effectively modeling emotion propagation at scale remains a significant issue. Additionally, emotional propagation is not solely dependent on user relationships but is also influenced by other factors, such as topic trends, temporal dynamics, and external events. Solely relying on graph-based models is insufficient to address all challenges in emotion analysis [24]. Incorporating these dynamic factors into graph models to improve the accuracy of emotion propagation prediction is an important direction for future research. The proposed Emotion-RGC Net aims to address these challenges by combining the strengths of RoBERTa and GNN. It leverages the emotional information from textual data while accounting for the relational structures between users, thereby enabling more efficient and accurate emotion recognition on social media.
Application of pre-trained language models
With the rapid advancements in deep learning within the field of Natural Language Processing (NLP), pre-trained language models have become one of the mainstream methods for emotion analysis. These models are pre-trained on large-scale, unsupervised corpora, learning rich contextual information and effectively transferring to downstream tasks such as sentiment recognition and emotion classification [25]. This "pre-training and fine-tuning" paradigm significantly enhances the performance of emotion analysis, especially in tasks where the amount of data is limited or the cost of annotation is high. The successful application of pre-trained models like BERT and RoBERTa has notably improved the accuracy and robustness of textual emotion recognition. In recent years, the application of BERT (Bidirectional Encoder Representations from Transformers) in emotion analysis tasks has garnered significant attention [26]. By utilizing a bidirectional Transformer structure, BERT can simultaneously capture both the preceding and succeeding context of a sentence, leading to a deeper understanding of its semantics and emotional content. Building upon BERT, further improved versions have been developed, such as RoBERTa (Robustly Optimized BERT Approach), which enhances language representation by increasing the amount of pre-training data, extending training time, and optimizing hyperparameters [27]. These pre-trained models can capture not only semantic information at the word and phrase levels but also implicit emotional signals within the context, thus providing a clear advantage in emotion classification tasks.
The application of pre-trained language models in emotion analysis brings several distinct advantages. By leveraging unsupervised learning on large-scale text corpora, these models significantly reduce the reliance on labeled data for emotion analysis tasks. This is particularly critical in scenarios where emotion-labeled datasets are scarce, such as in social media, where obtaining high-quality, emotion-annotated datasets is often challenging [28]. Pre-trained models enable knowledge transfer from the pre-training phase to emotion analysis tasks, thereby enhancing model performance. Moreover, these models exhibit strong generalization capabilities, allowing them to easily adapt to various emotion classification tasks, whether based on broad emotional categories (e.g., anger, joy, sadness) or more fine-grained emotion distinctions [29]. Pre-trained language models also tend to generalize well, not only performing exceptionally on training data but also handling the diversity and complexity of language in real-world applications, thus avoiding issues like overfitting.
Despite the tremendous success of pre-trained models in emotion analysis tasks, several limitations remain. Pre-trained models primarily focus on textual features, while in social media emotion analysis, the relationships and interactions between users are equally critical. While these models can extract emotional features from individual texts, they often overlook the propagation and evolution of emotions within a user network, limiting their potential for complex social network-based emotion analysis [30]. Additionally, large-scale models like RoBERTa require significant computational resources during both training and inference, which poses challenges when applying these models to large-scale social media data [8]. To overcome these limitations, this paper not only leverages RoBERTa to extract emotional features from text but also incorporates Graph Neural Networks (GNNs) to capture the emotional propagation relationships between users. This combination effectively addresses the shortcomings of relying solely on pre-trained models, enabling Emotion-RGC Net to consider both textual and network structure features in social media emotion recognition, thus improving the accuracy and robustness of emotion classification.
Methods
Although deep learning models such as CNNs, RNNs, and pre-trained language models have demonstrated strong performance in text-based emotion analysis, text-only approaches exhibit clear limitations in capturing the propagation of emotions across user networks and in enforcing consistency among predicted emotion labels. To address these challenges, we propose Emotion-RGC Net, a hybrid model that combines RoBERTa for emotional feature extraction, a GNN for modeling user relationships, and a CRF for global label optimization, balancing computational efficiency with the ability to capture complex relational and contextual patterns. The subsequent section describes this model in detail.
Overall model: emotion-RGC net
In this paper, we propose Emotion-RGC Net, a multi-module deep learning model designed for emotion recognition on social media. The innovation of this model lies in its ability to address multi-layered challenges in emotion analysis by combining a pre-trained language model (RoBERTa), a Graph Neural Network (GNN), and Conditional Random Fields (CRF). The overall architecture of the model is shown in Fig 1, consisting of three core modules: RoBERTa for emotion feature extraction from text, GNN for modeling user relationships and capturing emotion propagation, and CRF for post-processing and optimizing emotion classification.
[Figure omitted. See PDF.]
The RoBERTa module extracts emotion features from social media text; the GNN module builds and updates the social relationship network between users, capturing emotion propagation patterns; the CRF module globally optimizes the final emotion classification to ensure coherence and consistency in emotional states.
In the text input processing phase, the model receives user-generated text data from social media as input. Given that social media text is often highly unstructured, containing emoticons, non-standard language, and more, Emotion-RGC Net employs the RoBERTa model for deep feature extraction. RoBERTa, with its bidirectional Transformer architecture, captures fine-grained emotional cues and language dependencies from the context. After preprocessing, the text is embedded into vector representations, forming feature vectors rich in semantic and emotional information.
Next comes the user relationship modeling and emotion propagation phase. Emotion-RGC Net utilizes GNN (Graph Neural Network) to capture the network of relationships between social media users. GNN treats users as nodes in the graph and interactions (such as comments, shares, or likes) as edges, constructing a social network structure among users. The text features generated by RoBERTa are passed into the GNN model as initial node (user) representations. Through a multi-layer message-passing mechanism, each user node aggregates emotional information from its neighboring nodes, capturing the emotion propagation patterns between users. After several iterations, the GNN updates each node’s emotion representation, ultimately producing feature vectors that reflect the propagation characteristics of emotions across the social network.
In the third stage, the model performs emotion classification and optimization. The emotion features processed by GNN are input into CRF (Conditional Random Fields), which further optimizes the emotion classification globally. CRF models the dependencies between emotional states, ensuring that the classification results remain consistent across sequences of social interactions. For example, in an interaction chain, users’ emotions often correlate with one another, and CRF captures this dynamic, avoiding disjointed predictions. Additionally, CRF leverages the contextual emotional information provided by the GNN to further enhance the accuracy and coherence of the emotion classification.
As shown in Fig 1, RoBERTa serves as the emotional feature extraction module, responsible for producing vector representations that encode the emotional content of the input text, including emotional polarity (positive, negative, etc.) and emotional intensity. These features are then passed to the Graph Neural Network (GNN) for further processing and aggregation. The role of the GNN in this model is to propagate emotional features along the graph structure between nodes, simulating the process of emotion propagation in social networks. In each GNN layer, a node's emotional features are updated according to the emotional information of its adjacent nodes, eventually forming a global view of emotion transmission. The CRF module optimizes the emotion transmission path: since inconsistency or noise may arise during propagation, the CRF helps the model select the most consistent emotion sequence by maximizing the global conditional probability. By modeling the transition probabilities between labels and the relationship between observed features and labels, CRF provides an effective global optimization strategy that further improves the accuracy and consistency of emotion transmission.
Working closely together, these modules allow Emotion-RGC Net to model the emotion propagation process more accurately and to handle both the local and global dependencies of emotional information. The emotion features extracted by RoBERTa provide strong local information for the GNN, while the graph-structured aggregation of the GNN provides the basis for the CRF's optimization, finally forming a powerful and efficient emotion propagation model. This modular design not only improves the expressive power of the model but also enhances its robustness in complex emotion propagation scenarios.
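To make the pipeline concrete, the sketch below shows one way the three modules could be composed in PyTorch; the layer sizes, the mean-aggregation scheme, and the use of a dense adjacency matrix are illustrative assumptions rather than the authors' released implementation.

import torch
import torch.nn as nn
from transformers import RobertaModel

class EmotionRGCNet(nn.Module):
    """Illustrative composition of the three modules described above.
    Dimensions, layer counts, and the CRF interface are assumptions."""
    def __init__(self, num_labels: int, hidden_dim: int = 768, gnn_layers: int = 2):
        super().__init__()
        self.roberta = RobertaModel.from_pretrained("roberta-base")   # text feature extractor
        self.gnn = nn.ModuleList(
            [nn.Linear(2 * hidden_dim, hidden_dim) for _ in range(gnn_layers)]
        )  # simple message-passing layers (Eqs. 7-10)
        self.emission = nn.Linear(hidden_dim, num_labels)             # per-node label scores for the CRF
        self.transitions = nn.Parameter(torch.zeros(num_labels, num_labels))  # CRF transition weights, used by a Viterbi-style decoder

    def forward(self, input_ids, attention_mask, adjacency):
        # 1) RoBERTa: one [CLS] vector per user post (Eq. 6)
        h = self.roberta(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        # 2) GNN: mean-aggregate neighbour features and update each node (Eqs. 9-10)
        for layer in self.gnn:
            messages = adjacency @ h / adjacency.sum(dim=1, keepdim=True).clamp(min=1)
            h = torch.relu(layer(torch.cat([h, messages], dim=-1)))
        # 3) Emission scores consumed by the CRF decoder (Eqs. 12-16)
        return self.emission(h)

At inference time, the emission scores and the transition weights would be passed to a Viterbi-style decoder, as discussed in the CRF post-processing subsection.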
The running process of the Emotion-RGC Net model is shown in Algorithm 1.
Algorithm 1: Training Process of Emotion-RGC Net
Input: Sentiment140 dataset, Emotion dataset
Output: trained Emotion-RGC Net, evaluation metrics
Function TrainEmotionRGCNet(Sentiment140, Emotion):
    Initialize Model ← Emotion-RGC Net
    Initialize Optimizer ← Adam (α = 0.001)
    Initialize LossFunction ← CrossEntropyLoss
    for epoch ← 1 to max_epochs do
        for each batch B ∈ {Sentiment140, Emotion} do
            inputs, labels ← B
            outputs ← Model(inputs)
            Loss ← LossFunction(outputs, labels)
            backpropagate Loss and update parameters: Optimizer.update(Loss)
        end
        if epoch mod eval_interval = 0 then
            predictions ← Model(test_inputs)
            accuracy ← Accuracy(predictions, test_labels)
            recall ← Recall(predictions, test_labels)
            f1 ← F1_score(predictions, test_labels)
            auc ← AUC(predictions, test_labels)
            if accuracy > best_accuracy then
                best_accuracy ← accuracy
                best_model ← Model
            end
        end
    end
    return best_model, best_accuracy, recall, f1, auc
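Expressed in code, the training procedure of Algorithm 1 might look like the following PyTorch sketch; the data loaders, the batch format, and the `evaluate` helper are assumptions introduced for illustration.

import torch
from torch.optim import Adam
from torch.nn import CrossEntropyLoss

def train_emotion_rgc_net(model, train_loader, test_loader, max_epochs=30, eval_interval=5):
    """Sketch of Algorithm 1: Adam (lr = 0.001), cross-entropy loss,
    periodic evaluation, and best-model tracking. `evaluate` is a hypothetical
    helper returning test-set accuracy (recall/F1/AUC would be computed there too)."""
    optimizer = Adam(model.parameters(), lr=1e-3)
    loss_fn = CrossEntropyLoss()
    best_accuracy, best_model_state = 0.0, None

    for epoch in range(1, max_epochs + 1):
        model.train()
        for inputs, labels in train_loader:      # inputs assumed to be a dict of model arguments
            optimizer.zero_grad()
            outputs = model(**inputs)            # emission scores per sample
            loss = loss_fn(outputs, labels)
            loss.backward()                      # backpropagate and update parameters
            optimizer.step()

        if epoch % eval_interval == 0:
            model.eval()
            accuracy = evaluate(model, test_loader)
            if accuracy > best_accuracy:
                best_accuracy = accuracy
                best_model_state = model.state_dict()

    return best_model_state, best_accuracy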
Emotion feature extraction module: RoBERTa
In the Emotion-RGC Net model, RoBERTa serves as the key module for emotion feature extraction. Using a bidirectional Transformer structure, it extracts emotional cues from text and generates high-dimensional feature vectors suitable for subsequent processing. The strength of the RoBERTa model lies in its deep language representations learned from extensive pre-training on large-scale corpora, showcasing exceptional capability when handling social media texts with complex semantic dependencies [8,31].
As shown in Fig 2, RoBERTa is based on the BERT architecture and consists of multiple layers of bidirectional Transformer encoders. Transformers allow the model to attend to both preceding and following context simultaneously within a sentence. Compared to traditional unidirectional models, RoBERTa’s bidirectional approach enables it to more comprehensively capture both the semantic and emotional information present in the text. Each Transformer layer comprises two sub-layers: a self-attention mechanism and a feed-forward neural network.
[Figure omitted. See PDF.]
Input text is first processed through word and positional embeddings to generate context-aware vector representations. After being processed by the multi-head self-attention mechanism, the model gradually produces emotion features containing global contextual information, with the final global emotion representation output through the [CLS] token.
In the Emotion-RGC Net model, RoBERTa is responsible for extracting emotion features from social media text. The process includes input preprocessing, generating contextual feature representations through the Transformer layers, and outputting the final emotion feature vector.
The input text is first tokenized, mapping the text sequence into a series of subword units (tokens) and representing them as embedding vectors. Suppose the tokenized input sequence is $X = \{x_1, x_2, \dots, x_n\}$; then the embedding for each token $x_i$ is defined as:
$e_i = E_{\mathrm{word}}[x_i] + E_{\mathrm{pos}}[i]$ (1)
where $E_{\mathrm{word}}[x_i]$ is the word embedding from the matrix $E_{\mathrm{word}} \in \mathbb{R}^{V \times d}$, and $E_{\mathrm{pos}}[i]$ is the positional embedding from the matrix $E_{\mathrm{pos}} \in \mathbb{R}^{n \times d}$, where $d$ is the dimension of the embedding vector, $V$ is the vocabulary size, and $n$ is the input sequence length. Through word and positional embeddings, the text sequence is transformed into context-aware representations, which serve as input to the Transformer layers.
Each layer of the Transformer contains a multi-head self-attention mechanism to capture the contextual dependencies within the input text sequence. For each input sequence $H = (e_1, \dots, e_n)$, the self-attention mechanism computes attention scores that capture the relationships between each token and other tokens in the sequence:
$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\dfrac{QK^{\top}}{\sqrt{d_k}}\right)V$ (2)
where $Q = HW^{Q}$, $K = HW^{K}$, and $V = HW^{V}$, with $W^{Q}$, $W^{K}$, $W^{V}$ as the projection matrices for queries, keys, and values, and $d_k$ being the dimension of each head. For multi-head attention, the formula expands to:
$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\,W^{O}$ (3)
where $\mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V})$, and $W^{O}$ is the output weight matrix, with $h$ as the number of attention heads.
This mechanism allows each token to attend to different features of other tokens across multiple attention heads, enhancing the model's ability to understand complex texts, particularly in capturing long-range dependencies. After being processed through multiple Transformer layers, the contextual feature representation generated at layer $l$ is:
$H^{(l)} = \mathrm{LayerNorm}\!\left(H^{(l-1)} + \mathrm{MultiHead}\big(H^{(l-1)}, H^{(l-1)}, H^{(l-1)}\big)\right)$ (4)
where $H^{(l)}$ is the output matrix of the $l$-th layer, representing the contextual feature representation after multi-head attention and a residual connection.
After several Transformer layers, RoBERTa produces the contextual feature representation matrix $H^{(L)}$, where $L$ denotes the final layer. To generate a global representation of the text, the special [CLS] token is used, which is designed to represent the entire input sequence:
$h_{\mathrm{[CLS]}} = H^{(L)}_{\mathrm{[CLS]}}$ (5)
where $h_{\mathrm{[CLS]}}$ is the feature vector for the [CLS] token, representing the global emotion of the entire text.
After RoBERTa's feature extraction process, the social media text is mapped to a global emotion feature vector $z$:
$z = \mathrm{Norm}\!\left(h_{\mathrm{[CLS]}}\right)$ (6)
where $h_{\mathrm{[CLS]}}$ is the output of the final layer's [CLS] token, normalized to produce the final global feature vector $z$, which is then passed to the subsequent Graph Neural Network (GNN) module for modeling emotion propagation and interaction among users.
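As a concrete illustration of Eqs. (1)-(6), the snippet below uses the Hugging Face transformers library to map a single post to a normalized [CLS] feature vector; the choice of the roberta-base checkpoint and the final normalization step are assumptions, and note that RoBERTa's tokenizer uses the start token <s> as its [CLS] equivalent.

import torch
from transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

def extract_emotion_feature(text: str) -> torch.Tensor:
    """Map one social media post to the global feature vector z (Eq. 6):
    tokenize, run the Transformer stack, take the final-layer <s> ([CLS]) vector."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        outputs = encoder(**inputs)
    cls_vector = outputs.last_hidden_state[:, 0, :]               # h_[CLS] from the last layer
    return torch.nn.functional.normalize(cls_vector, dim=-1)      # normalized global feature z

z = extract_emotion_feature("so happy about the new release!! :)")
print(z.shape)  # torch.Size([1, 768])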
Through this multi-level feature extraction process, Emotion-RGC Net is able to capture fine-grained emotion features from text. In the context of social media, this feature extraction lays a solid foundation for the subsequent stages of emotion propagation modeling and emotion classification.
User relationship modeling module: GNN
In the Emotion-RGC Net model, the Graph Neural Network (GNN) is responsible for capturing user relationships and emotion propagation within social networks [32]. Since emotions are not only expressed by individuals but also propagated through interactions among users, relying solely on text feature extraction often fails to reflect the dynamic evolution of emotions [9]. To address this issue, GNN models the social relationships between users and incorporates the emotion propagation process into the emotion analysis framework, thereby enhancing the overall accuracy of emotion recognition. In this model, each user is represented as a node in a graph, with edges representing interactions between users (e.g., comments, likes, shares). By iteratively passing messages between nodes and updating their states, each user node learns not only from its own emotional features but also from the emotional states of its neighboring nodes, capturing the patterns of emotion propagation throughout the network. GNN models emotion propagation in the social network through the following iterative update formula:
$h_v^{(l+1)} = \sigma\!\left(W^{(l)}\left[\,h_v^{(l)} \,\Vert\, \mathrm{AGG}\big(\{h_u^{(l)} : u \in \mathcal{N}(v)\}\big)\right]\right)$ (7)
As shown in Fig 3, GNN gradually updates the emotion representation of each node through a multi-layer message-passing mechanism. Initially, the feature vectors of the user nodes come from the text emotion features extracted by RoBERTa. GNN then aggregates information from neighboring nodes to update each node’s state. After multiple layers of propagation, the final emotion representation of each node incorporates both its own emotion features and those of its neighboring nodes. The GNN process consists of three steps: node initialization, message passing, and node updating, with the final output being feature vectors that encapsulate emotion propagation information.
[Figure omitted. See PDF.]
It includes a social network structure composed of nodes (users) and edges (interactions between users). Through multiple layers of message passing, nodes receive and aggregate emotional information from their neighboring nodes, gradually updating the emotional representation of each node.
The input to GNN is a graph structure G = ( V , E ) , where V represents the set of user nodes and E represents the interaction relationships between users. For each node $v \in V$, its initial feature vector is derived from the text features extracted by RoBERTa:
$h_v^{(0)} = \mathrm{RoBERTa}(t_v) \in \mathbb{R}^{d}$ (8)
where $t_v$ is the text posted by user $v$, and $h_v^{(0)}$ is the emotion feature vector generated by RoBERTa, with $d$ being the dimensionality of the feature vector.
The core of GNN is the message-passing mechanism, where information is propagated between nodes via their edges. In each round of message passing, a node $v$ receives information from its neighboring nodes $\mathcal{N}(v)$, and the features of the neighboring nodes are aggregated using an aggregation function AGG. The message-passing formula is as follows:
$m_v^{(l+1)} = \mathrm{AGG}\big(\{h_u^{(l)} : u \in \mathcal{N}(v)\}\big)$ (9)
where $m_v^{(l+1)}$ represents the message passed to node $v$ from its neighboring nodes in layer $l+1$, and $h_u^{(l)}$ is the feature representation of neighboring node $u$ in layer $l$. The aggregation function AGG can be as simple as summation or averaging, or it can be more complex, such as max pooling or weighted summation.
After each round of message passing, the feature vector of each node is updated based on the aggregated information from its neighbors:
$h_v^{(l+1)} = \sigma\!\left(W^{(l)}\left[\,h_v^{(l)} \,\Vert\, m_v^{(l+1)}\right]\right)$ (10)
where $W^{(l)}$ is a learnable weight matrix, and $\sigma$ is a non-linear activation function (such as ReLU). This process combines both the node's own features and its neighbors' information, allowing each node to capture the emotion propagation within the entire network.
After $L$ layers of message passing and node updating, the final feature vector $h_v^{(L)}$ for each node contains the emotional features of both the node itself and its neighbors. The output of the GNN is the feature matrix $H^{(L)} \in \mathbb{R}^{|V| \times d}$, representing the emotional states of all nodes in the social network, where $|V|$ is the number of nodes and $d$ is the dimension of each node's feature vector.
$H^{(L)} = \big[\,h_{v_1}^{(L)}; h_{v_2}^{(L)}; \dots; h_{v_{|V|}}^{(L)}\,\big]$ (11)
These feature vectors will serve as input to the next step in the pipeline, the Conditional Random Fields (CRF) module, for global optimization of emotion classification.
During integration, the embedding vectors output by RoBERTa are used as the input node features of the GNN; the graph convolution layers of the GNN then mine the relationships between nodes, and message passing and aggregation are performed with the support of the global graph structure. Through this integration strategy, we effectively combine the semantic information of the text with the spatial relationships of the graph structure, improving the performance of the model on complex tasks.
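A minimal sketch of the message-passing update in Eqs. (9)-(10) is given below, using mean aggregation over an explicit neighbour list; the toy graph, the layer count, and the neighbour-list representation are illustrative assumptions.

import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """One round of Eqs. (9)-(10): mean-aggregate neighbour features,
    concatenate with the node's own state, then apply a linear map and ReLU."""
    def __init__(self, dim: int):
        super().__init__()
        self.W = nn.Linear(2 * dim, dim)

    def forward(self, h: torch.Tensor, neighbors: list) -> torch.Tensor:
        messages = torch.stack([
            h[nbrs].mean(dim=0) if nbrs else torch.zeros_like(h[v])   # m_v = AGG({h_u})
            for v, nbrs in enumerate(neighbors)
        ])
        return torch.relu(self.W(torch.cat([h, messages], dim=-1)))   # h_v^(l+1)

# Toy graph with 3 users: node 0 interacts with nodes 1 and 2, over L = 2 layers.
h = torch.randn(3, 768)                                   # initial features from RoBERTa (Eq. 8)
layers = nn.ModuleList([MessagePassingLayer(768) for _ in range(2)])
for layer in layers:
    h = layer(h, neighbors=[[1, 2], [0], [0]])
print(h.shape)  # torch.Size([3, 768]) -- one emotion representation per user (Eq. 11)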
Post-processing module: CRF
Conditional Random Fields (CRF) are a widely used probabilistic model for sequence labeling tasks, particularly suited to handling problems with interdependent labels. CRF models the input features and the output label sequence jointly, considering the dependencies between labels to maximize the global probability of the label sequence [33]. In emotion classification tasks, CRF is particularly useful for capturing transitions between emotional states. For instance, a user’s emotional state across multiple social media posts is often continuous, or the emotional states of different users may become correlated through interactions [10].
In the Emotion-RGC Net model, the CRF module plays a key role in post-processing. While the GNN generates feature vectors for each user node, incorporating both their own emotional features and those of their neighbors, emotion classification is not simply an isolated task. Emotional states often exhibit continuity over time or correlations across user relationships. By modeling the dependencies between emotional states, CRF performs a global optimization on the emotion features output by the GNN, thus improving the accuracy of the final emotion classification.
As shown in Fig 4, CRF takes the feature vectors output by the GNN, $H^{(L)} = \{h_v^{(L)}\}_{v \in V}$, as input and optimizes the global emotion classification by considering the dependencies between emotion labels. Each node's feature vector $h_v^{(L)}$ corresponds to the predicted emotional state of user $v$. CRF not only considers the features of individual nodes but also introduces transition probabilities between labels to ensure coherence in the global emotion classification results.
[Figure omitted. See PDF.]
CRF receives the node emotion features from GNN, calculates the matching score of each node’s emotion label using a feature function ψ, and models the dependencies between neighboring nodes’ labels using a label transition function ϕ. The Viterbi algorithm is then employed to globally decode the optimal emotion label sequence by maximizing the conditional probability.
In this model, CRF optimizes the emotion features output by GNN by modeling the joint probability of the emotion label sequence. Given the feature vectors $H = \{h_v\}_{v \in V}$ from GNN, the goal of CRF is to predict the emotion labels $Y = \{y_v\}_{v \in V}$ by maximizing the conditional probability $P(Y \mid H)$, where $y_v$ is the emotion label of node $v$.
The conditional probability model for CRF is defined as:
$P(Y \mid H) = \dfrac{1}{Z(H)} \exp\!\left( \sum_{v \in V} \psi(y_v, h_v) + \sum_{(u, v) \in E} \phi(y_u, y_v) \right)$ (12)
where $\psi(y_v, h_v)$ is the feature function for node $v$, representing the compatibility between node $v$'s emotion label $y_v$ and its feature vector $h_v$. $\phi(y_u, y_v)$ is the transition function, modeling the dependencies between neighboring emotion labels $y_u$ and $y_v$ and capturing the transition patterns of emotional states. $Z(H)$ is the normalization factor that ensures the probability distribution sums to 1:
$Z(H) = \sum_{Y'} \exp\!\left( \sum_{v \in V} \psi(y'_v, h_v) + \sum_{(u, v) \in E} \phi(y'_u, y'_v) \right)$ (13)
The feature function $\psi(y_v, h_v)$ represents how well the emotion label $y_v$ matches the feature vector $h_v$ for node $v$, expressed through a linear model as:
$\psi(y_v, h_v) = w_{y_v}^{\top} h_v + b_{y_v}$ (14)
where $w_{y}$ is the weight vector for emotion label $y$, and $b_{y}$ is a bias term. This function allows CRF to assign a preliminary probability to each node's emotion label based on its feature vector.
The transition function $\phi(y_u, y_v)$ models the dependencies between consecutive emotion labels, ensuring that the emotional states of neighboring nodes do not change abruptly. It is defined as:
$\phi(y_u, y_v) = T_{y_u, y_v}$ (15)
where $T$ is the transition weight matrix and $T_{y_u, y_v}$ represents the transition from emotion label $y_u$ to $y_v$. This matrix can be learned from historical data by capturing the transition patterns between labels.
CRF finds the optimal emotion label sequence $Y^{*}$ by maximizing the conditional probability $P(Y \mid H)$. Since directly computing the conditional probability for all possible label sequences is computationally expensive, the Viterbi algorithm is employed to efficiently find the most likely label sequence globally, ensuring the consistency and coherence of the final classification results:
$Y^{*} = \arg\max_{Y} P(Y \mid H)$ (16)
By introducing the transition function between labels and the feature matching function, CRF ensures that the emotion classification results are consistent both temporally and within the relational network. This global optimization process allows emotion classification to account for not only individual features but also the dynamic changes of emotions in context, significantly improving classification accuracy. The CRF module provides global optimization for the final output of Emotion-RGC Net, combining the local emotion features generated by the GNN with the coherence of the global emotion sequence, thus enhancing the model’s classification capability in social media emotion recognition.
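For illustration, the following sketch decodes the most likely label sequence of Eq. (16) for a single interaction chain using the Viterbi algorithm; the unary scores stand in for ψ and the transition matrix for ϕ, while batching and learned parameters are omitted. Libraries such as torchcrf offer batched, trainable versions of this step.

import torch

def viterbi_decode(emissions: torch.Tensor, transitions: torch.Tensor) -> list:
    """Find the label sequence maximizing Eq. (16) for one chain of nodes.
    emissions: (seq_len, num_labels) scores psi(y_v, h_v);
    transitions: (num_labels, num_labels) scores phi(y_u, y_v)."""
    seq_len, num_labels = emissions.shape
    score = emissions[0]                      # best score ending in each label at step 0
    backpointers = []
    for t in range(1, seq_len):
        # score[i] + transitions[i, j] + emissions[t][j]: best path ending in i, then moving to j
        total = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, best_prev = total.max(dim=0)
        backpointers.append(best_prev)
    # Trace back the highest-scoring path
    best_label = int(score.argmax())
    path = [best_label]
    for best_prev in reversed(backpointers):
        best_label = int(best_prev[best_label])
        path.append(best_label)
    return list(reversed(path))

# Toy example: 4 nodes in an interaction chain, 3 emotion labels.
labels = viterbi_decode(torch.randn(4, 3), torch.randn(3, 3))
print(labels)  # e.g. [2, 2, 0, 1]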
Experiment
Datasets
To evaluate the effectiveness of Emotion-RGC Net in social media emotion recognition tasks, we conducted experiments using two widely utilized sentiment analysis datasets: the Sentiment140 Dataset and the Emotion Dataset. These datasets encompass a rich variety of social media content, offering diversity and representativeness, which effectively assess the model’s performance across different scenarios.
Sentiment140 is a well-established social media dataset in the sentiment analysis domain, primarily based on Twitter. It contains over 1.6 million tweets labeled with sentiment categories: positive, negative, and neutral. In addition to the tweet text, the dataset includes information about interactions between users, providing a foundation for modeling emotion propagation [34]. The tweets are typically short, highly informal, and feature a significant amount of emoticons, abbreviations, slang, and typos. Moreover, the subtlety and ambiguity in emotional expressions make emotion recognition particularly challenging. The Sentiment140 dataset is particularly suitable for testing the RoBERTa module of Emotion-RGC Net, specifically its capability to extract features from complex language structures, and for evaluating GNN’s performance in capturing user relationships and modeling emotion propagation.
The Emotion Dataset, in contrast, focuses on fine-grained multi-class emotion classification. Unlike Sentiment140, this dataset provides a richer set of emotion labels, including joy, anger, sadness, fear, surprise, disgust, and neutral. The data comes from social media comments, blogs, and conversation data, covering a wide range of emotional expressions and contexts [35]. This dataset offers an excellent opportunity to assess the CRF module in Emotion-RGC Net, particularly in its ability to enforce global consistency in emotion label optimization. Additionally, the dataset includes user interaction information, which can be used to further explore the paths and evolution of emotion propagation.
The Sentiment140 and Emotion datasets cover both coarse-grained and fine-grained emotion analysis tasks. The former focuses on basic sentiment classification, while the latter introduces more complex emotion categories. This diversity ensures that the experimental results span from fundamental to advanced emotion classification scenarios. Furthermore, both datasets are based on real-world social media content, providing high representativeness. The tweets, comments, and other content reflect the diverse forms of emotional expression prevalent on social media, enhancing the practical applicability of the experimental outcomes.
The Sentiment140 and Emotion datasets provide a comprehensive foundation for evaluating the performance of Emotion-RGC Net. They not only allow us to test RoBERTa’s ability to extract textual features but also validate GNN’s effectiveness in capturing user relationships and emotion propagation, as well as assess the contribution of the CRF module in optimizing global emotion classification.
Experimental details
In our experiments, we conducted detailed parameter tuning for each module of Emotion-RGC Net to ensure the model effectively handles social media emotion recognition tasks. Additionally, multiple evaluation metrics were employed to measure the model’s performance, ensuring its competitiveness across various dimensions. The parameters of RoBERTa, GNN, and CRF modules were carefully adjusted to optimize the model’s performance on different datasets. The specific parameter settings for each module are presented in Table 1:
[Figure omitted. See PDF.]
To comprehensively evaluate the performance of Emotion-RGC Net in emotion recognition tasks, we selected four key evaluation metrics. Accuracy measures the overall performance of the model by indicating the ratio of correctly classified samples to the total number of samples. Recall represents the model’s ability to identify actual positive samples. F1-score is the harmonic mean of precision and recall, balancing the model’s performance between these two aspects. AUC (Area Under the Curve) evaluates the model’s classification performance across various thresholds, with values ranging from 0 to 1, where a higher value indicates better classification performance.
$\mathrm{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$ (17)
$\mathrm{Recall} = \dfrac{TP}{TP + FN}$ (18)
$\mathrm{F1} = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (19)
These metrics provide a comprehensive evaluation of the model’s performance from different perspectives, ensuring that Emotion-RGC Net not only performs well in terms of accuracy but also effectively captures the true emotional categories. This is especially important when handling complex social media data, where robustness and generalization are critical.
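These metrics can be computed with standard tooling; the sketch below uses scikit-learn, and the macro averaging and one-vs-rest AUC for the multi-class Emotion dataset are assumptions about the evaluation protocol rather than details stated in the paper.

from sklearn.metrics import accuracy_score, recall_score, f1_score, roc_auc_score

def compute_metrics(y_true, y_pred, y_score):
    """Compute the four reported metrics (Eqs. 17-19 plus AUC).
    y_score: per-class probabilities of shape (n_samples, n_classes)."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred, average="macro"),
        "f1": f1_score(y_true, y_pred, average="macro"),
        "auc": roc_auc_score(y_true, y_score, multi_class="ovr", average="macro"),
    }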
In terms of data division, the Sentiment140 dataset contains over 1.6 million tweets; after random shuffling, we divided them into training and test sets at a ratio of 80% to 20%. The Emotion dataset contains about 100,000 text samples with multi-category emotion labels; we divided it into training, validation, and test sets at 70%, 15%, and 15%, ensuring a balanced distribution of emotion categories in each split and avoiding over- or under-representation of any category.
[Figure omitted. See PDF.]
In addition, all texts are standardized and cleaned by removing HTML tags, URLs, and special characters, and RoBERTa's tokenizer is used to transform each text into a fixed-length token sequence (length 256), truncated or padded as necessary to meet the model's input requirements. For the Sentiment140 dataset, the number of samples per category is made relatively balanced through oversampling. For the Emotion dataset, which includes seven emotion categories with an uneven distribution, the category samples were likewise balanced by oversampling to improve the diversity of the training set and the generalization ability of the model.
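A possible implementation of this cleaning and encoding step is sketched below; the specific regular expressions are assumptions, while the fixed length of 256 with truncation and padding follows the description above.

import re
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

def clean_and_encode(text: str):
    """Strip HTML tags, URLs, and special characters, then encode to a
    fixed length of 256 tokens with truncation/padding (regexes are illustrative)."""
    text = re.sub(r"<[^>]+>", " ", text)                  # HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # URLs
    text = re.sub(r"[^\w\s.,!?':;()]", " ", text)         # other special characters
    return tokenizer(text, max_length=256, truncation=True,
                     padding="max_length", return_tensors="pt")

encoded = clean_and_encode("<b>loving</b> this update!! see https://t.co/xyz")
print(encoded["input_ids"].shape)  # torch.Size([1, 256])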
Results analysis
From the loss curves (Fig 5), it can be observed that both the training loss and validation loss on the two datasets exhibit a significant downward trend as the training epochs increase, indicating that the model converges rapidly within a few training cycles. On the Sentiment140 dataset, the loss drops sharply from its initially high value within the first 10 epochs, reflecting that Emotion-RGC Net effectively learns the emotional features of this dataset. A similar trend is observed on the Emotion dataset, where the initial loss is relatively high but quickly decreases within the first few epochs and gradually converges. This fast convergence can be attributed to the RoBERTa pre-trained model and the GNN module in the design. RoBERTa's strong capability in extracting textual features allows it to capture key emotion-related information early on, accelerating the learning process. The GNN further captures the emotional propagation between users, helping the model quickly gain global insight during the early stages of training, which facilitates overall convergence.
On the Sentiment140 dataset, the validation loss closely mirrors the training loss, with only a small gap between the two, indicating that Emotion-RGC Net demonstrates good generalization on this dataset. In contrast, on the Emotion dataset, while the overall trend is similar, the validation loss shows greater fluctuation, particularly in the later stages of training (between 20 and 30 epochs), where a temporary increase in validation loss is observed. This suggests that the multi-emotion nature of the Emotion dataset poses greater challenges for the model, making it harder to generalize effectively when handling more complex emotion classification tasks. This phenomenon highlights that while Emotion-RGC Net performs stably in fine-grained emotion classification tasks (such as those in the Emotion dataset), its generalization capability on a broader range of emotion categories still requires further optimization. It also reflects that the model design of Emotion-RGC Net has a clear advantage in handling simpler emotion classification tasks, but improvements are needed when addressing more complex multi-class emotion classification.
From the loss curves, we can conclude that Emotion-RGC Net demonstrates good convergence and robustness on both datasets. The model’s training and validation losses remain stable, particularly on the Sentiment140 dataset, validating its effectiveness in coarse-grained emotion classification tasks. However, the greater fluctuation in validation loss on the Emotion dataset indicates that multi-class emotion classification presents a higher challenge for the model. Future work could focus on further optimizing the model structure and data processing methods to enhance performance in fine-grained emotion analysis.
The Emotion dataset contains seven emotion categories; compared with the three sentiment labels in the Sentiment140 dataset, fine-grained emotion classification on the Emotion dataset is undoubtedly a more complex task. We observed that the model's validation loss curve fluctuated considerably on the Emotion dataset, especially late in training. Examining the confusion matrix (Fig 6), we found that classification accuracy for the "anger" and "fear" categories was lower, while classification of "happy" and "sadness" was more stable. This indicates that there is some overlap and confusion between the emotion categories of the Emotion dataset; in the fine-grained emotion classification task, the contexts of some categories are similar, making it difficult for the model to accurately distinguish between them.
[Figure omitted. See PDF.]
To further understand this phenomenon, we also computed detailed per-category statistics for accuracy, recall, and F1-score. The results showed that for some categories in the Emotion dataset (such as "anger" and "fear"), the ambiguity and polysemy of emotional expression caused the model to perform poorly, whereas categories such as "happy" and "sadness" were recognized more reliably because their expression characteristics are more distinctive. Emotion-RGC Net thus shows strong capability in handling fine-grained emotion classification tasks, but it also exposes limitations in multi-category classification, especially when there is substantial similarity or overlap between emotion categories.
As shown in the results of Table 2, we compared the performance of the Emotion-RGC Net model with several commonly used sentiment analysis models. Emotion-RGC Net demonstrated outstanding performance across various metrics on both the Sentiment140 and Emotion datasets, particularly excelling in key metrics such as accuracy, recall, F1-score, and AUC. On the Sentiment140 dataset, Emotion-RGC Net achieved an accuracy of 89.70%, and on the Emotion dataset, it reached 88.50%, significantly outperforming other models.
[Figure omitted. See PDF.]
Compared to traditional machine learning models such as Naive Bayes, Logistic Regression, and Decision Tree, Emotion-RGC Net showed substantial improvements. Specifically, on the Sentiment140 dataset, the accuracy of traditional models like Naive Bayes was 76.57%, and Logistic Regression reached 78.01%, whereas Emotion-RGC Net achieved 89.70%. This highlights the clear advantage deep learning models have in handling large-scale social media data, especially in capturing emotional context and complex relationships. Traditional models rely heavily on manual feature engineering, while Emotion-RGC Net leverages RoBERTa to automatically extract emotional features from text and uses GNN to model interactions between users, combined with CRF for global optimization. These results indicate that the design of Emotion-RGC Net significantly enhances both the accuracy and robustness of emotion classification.
However, despite RoBERTa’s excellent performance in sentiment feature extraction, we noticed that it may have some limitations when dealing with complex sentiment categories. Although RoBERTa can capture rich sentiment information, it may encounter the problem of "missing context" or "insufficient background information" when faced with the diversity and fine-grained distinction of sentiment categories. For example, when sentiment expression depends on deeper background knowledge or context dependence, the language model relying solely on RoBERTa may not be able to fully understand and distinguish these complex emotions. In our experiments, Emotion-RGC Net effectively utilizes the social relationship network between users by combining RoBERTa and GNN, thereby making up for the possible shortcomings of RoBERTa in dealing with subtle differences in sentiment categories. Specifically, GNN can capture the potential connections between different sentiment categories by modeling the interactive relationship between users, thereby improving the classification accuracy.
In comparison with other deep learning models, such as LSTM, GRU, and CNN-LSTM, Emotion-RGC Net still maintains a clear advantage. On the Sentiment140 dataset, LSTM achieved an accuracy of 79.10%, while Emotion-RGC Net reached 89.70%. Similarly, on the Emotion dataset, LSTM achieved 78.90%, whereas Emotion-RGC Net achieved 88.50%, showing a significant performance gap. Although models like LSTM and GRU are effective at handling sequential data and have shown promising results in sentiment analysis tasks, they are limited in capturing long-range dependencies and processing large-scale social media data. Emotion-RGC Net, by combining the strengths of RoBERTa and GNN, not only captures emotional features from individual texts but also utilizes the relational network between users to model emotion propagation, significantly boosting its classification ability.
Additionally, Emotion-RGC Net performed exceptionally well on both F1-score and AUC metrics. F1-score is the harmonic mean of precision and recall, and it provides a more balanced measure of a model’s performance, especially on imbalanced data. On the Sentiment140 dataset, Emotion-RGC Net achieved an F1-score of 90, and on the Emotion dataset, it reached 89, both far exceeding other models. This indicates that Emotion-RGC Net not only captures more positive samples but also avoids excessive false positives. In terms of AUC (Area Under the Curve), Emotion-RGC Net scored 0.95 on Sentiment140 and 0.93 on the Emotion dataset, while other models typically had AUCs around 0.85. This further confirms that Emotion-RGC Net excels in distinguishing between positive and negative classes across different thresholds, which is crucial for accurate classification in sentiment analysis tasks.
From the above results, we can see that Emotion-RGC Net not only performs well in traditional sentiment analysis tasks, but also shows strong advantages when facing the fine-grained problem of sentiment classification. The superiority of RoBERTa in sentiment feature extraction combined with the advantage of GNN in capturing relationships between users greatly improves the overall performance of the model and makes up for the limitations of RoBERTa in dealing with complex sentiment categories.
Table 3 presents the results of the ablation study, which demonstrates that each module of the Emotion-RGC Net is critical for improving overall performance.
[Figure omitted. See PDF.]
When any module is removed, significant drops in accuracy, recall, and F1-score are observed on both the Sentiment140 and Emotion datasets. In particular, removing the RoBERTa or CRF module results in substantial declines in performance. After removing RoBERTa, the accuracy on the Sentiment140 dataset drops from 89.70% to 79.10%, highlighting RoBERTa's pivotal role in extracting emotional features from text. Similarly, removing the CRF module significantly affects the model's performance, especially in terms of consistency in emotion classification (F1-score). CRF's global optimization of emotion labels ensures coherent classification, and when this module is removed, the F1-score decreases by more than 3 percentage points. Moreover, the GNN module plays a crucial role in capturing the propagation of emotions between users. When the GNN module is removed, the model's performance also declines substantially, indicating that the information shared between users is essential for optimizing emotion classification. These results collectively suggest that the design of Emotion-RGC Net is both rational and effective, with the collaboration of each module significantly enhancing the model's performance in emotion recognition tasks.
Figs 7, 8 and 9 illustrate the visualization of the ablation study results for the Emotion-RGC Net model on both the Sentiment140 and Emotion datasets.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
Each module in the Emotion-RGC Net model plays an indispensable role in emotion recognition tasks. Especially when dealing with complex social media sentiment analysis tasks, the collaborative work of each module significantly improves the overall performance of the model. RoBERTa, as an emotional feature extraction module, plays a crucial role in extracting emotional information from text. After removing RoBERTa, the accuracy of the model dropped significantly, especially on the Sentiment140 data set, where the accuracy dropped from 89.70% to 79.10%. This shows that RoBERTa’s powerful text understanding capabilities play a key role in the model and can effectively capture emotion-related fine-grained features. Without RoBERTa, the model cannot extract enough emotional information from the text, resulting in significant performance degradation.
The removal of the CRF module also has a significant impact on model performance. CRF optimizes the consistency of emotion classification by modeling the global dependencies between emotion labels, especially in the temporal continuity and logical relationships between labels in emotion classification tasks. After removing CRF, the model’s F1-score dropped by more than 3 percentage points, indicating that after the relationship between emotional labels was ignored, more inconsistencies and errors appeared in the classification results. This result highlights the global optimization role of CRF in emotion classification tasks, especially when dealing with emotion labels with dependencies and temporal sequence.
Removing the GNN module likewise has a considerable impact. The GNN’s core advantage is capturing how emotions spread between users through the graph structure, which matters for social media sentiment analysis because emotions are not only expressed by individuals but also propagate through interactions in the network. Without the GNN module, performance drops markedly, particularly in modeling emotional propagation and the dependencies between users, and both accuracy and generalization deteriorate. This further confirms the importance of the GNN for modeling emotion propagation, especially on large-scale social network data, where the influence of user interactions on emotion analysis cannot be ignored.
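To illustrate the propagation mechanism, here is a minimal sketch of a two-layer graph convolutional network built with PyTorch Geometric that refines per-post features by aggregating information from neighbours in a user-interaction graph. The graph construction (edges for replies or retweets), the layer sizes, and the toy data are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class EmotionPropagationGNN(torch.nn.Module):
    """Two-layer GCN that refines per-post (or per-user) emotion features by
    aggregating information from neighbours in the interaction graph."""

    def __init__(self, in_dim=768, hidden_dim=256, out_dim=128):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

# Toy graph: 4 nodes with RoBERTa-sized features and a handful of interaction
# edges (e.g., replies or retweets); edge_index has shape (2, num_edges).
x = torch.randn(4, 768)
edge_index = torch.tensor([[0, 1, 1, 2, 3],
                           [1, 0, 2, 1, 1]], dtype=torch.long)
gnn = EmotionPropagationGNN()
print(gnn(x, edge_index).shape)  # torch.Size([4, 128])
```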
The module design of Emotion-RGC Net is therefore well considered, and the collaboration of the modules substantially enhances performance on complex emotion classification tasks: RoBERTa provides powerful text feature extraction, the GNN supplies additional contextual information by modeling how emotions are communicated between users, and the CRF optimizes the global consistency of the resulting classifications.
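Putting the pieces together, the sketch below shows one plausible way to wire the three modules into a single pipeline: pooled RoBERTa features per post, a GCN pass over the user-interaction graph, and a linear emission head feeding a CRF. The class name, dimensions, and the decision to treat all posts as a single CRF sequence are assumptions, not the authors’ exact implementation.

```python
import torch
from transformers import RobertaModel
from torch_geometric.nn import GCNConv
from torchcrf import CRF

class EmotionRGCNetSketch(torch.nn.Module):
    """Illustrative RoBERTa -> GNN -> CRF pipeline; names and sizes are assumptions."""

    def __init__(self, num_labels=6, hidden_dim=256):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("roberta-base")
        self.gnn = GCNConv(self.encoder.config.hidden_size, hidden_dim)
        self.emission = torch.nn.Linear(hidden_dim, num_labels)
        self.crf = CRF(num_labels, batch_first=True)

    def forward(self, input_ids, attention_mask, edge_index, tags=None):
        # 1) Text features: one vector per post from RoBERTa's <s> token.
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state[:, 0, :]
        # 2) Graph refinement: propagate features over the user-interaction graph.
        h = torch.relu(self.gnn(h, edge_index))
        # 3) Emission scores; all posts are treated as one CRF sequence here.
        emissions = self.emission(h).unsqueeze(0)        # (1, num_posts, num_labels)
        if tags is not None:                              # training: CRF NLL loss
            return -self.crf(emissions, tags.unsqueeze(0))
        return self.crf.decode(emissions)[0]              # inference: label sequence
```

How posts are batched into sequences and how `edge_index` is built from the social graph would depend on sampling choices that this section does not specify.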
Conclusion
This paper proposes Emotion-RGC Net, a novel emotion recognition model that combines three modules, RoBERTa, a Graph Neural Network (GNN), and a Conditional Random Field (CRF), to improve the accuracy and robustness of emotion analysis in social media. Large-scale experiments on the Sentiment140 and Emotion datasets show that Emotion-RGC Net performs well across all metrics, especially the core metrics of accuracy, recall, and F1-score, and significantly outperforms traditional machine learning models and other deep learning baselines. The pre-trained RoBERTa model plays an important role in processing unstructured social media text and extracting emotional features; the GNN captures how emotions are transmitted and exert influence in social networks by modeling interactions between users, which improves recognition accuracy; and the CRF enhances the global consistency and coherence of the classification results by modeling dependencies between emotion labels.
Despite these strong experimental results, the model still has limitations, particularly in computational complexity and data dependency. First, it relies on pre-trained RoBERTa and large-scale annotated data, so performance may degrade for low-resource languages or data-scarce sentiment analysis scenarios. Second, the GNN module is computationally expensive and can consume considerable time and resources when processing large-scale social networks. Third, although the CRF’s global optimization improves the consistency of emotion classification, its benefit may be limited for extreme emotions or atypical emotional expressions.
Future research will focus on enhancing the model’s performance in small-sample or unlabeled data scenarios through data augmentation or unsupervised learning techniques. Additionally, we aim to incorporate more emotional dimensions or fine-grained emotional categories to address more complex sentiment analysis challenges. As social media content becomes increasingly diverse and real-time, the dynamic temporal changes of emotions are also becoming an important issue. We plan to explore how to integrate temporal dynamics to capture the evolving patterns of emotions in social networks, thus improving the applicability and robustness of Emotion-RGC Net.
Emotion-RGC Net not only holds technical value but also has the potential for meaningful societal impact, particularly in online emotional support and mental health intervention. By analyzing users’ emotions on social media in real time, it could provide timely support for online mental health interventions, helping to identify users experiencing emotional distress and providing relevant organizations with evidence to guide intervention, which matters in addressing the growing mental health challenges of modern society. Emotion-RGC Net could also support social sentiment monitoring by tracking fluctuations in public emotion, offering governments and businesses early warnings about public sentiment and social dynamics and helping decision-makers refine public policies and business strategies.
However, as emotion recognition technology becomes widely deployed, privacy protection and data ethics become increasingly prominent challenges. Sentiment analysis models are trained on users’ personal data, which may involve the collection and use of sensitive information, so ensuring user privacy and data security and adhering to ethical standards during model development and deployment are critical concerns. We recommend implementing strict privacy protection measures when using Emotion-RGC Net, such as data encryption and anonymization, and strengthening transparency about data sources and usage purposes so that user privacy is safeguarded and personal rights are not violated. We believe that continued optimization of the model, together with strong ethical oversight, will allow Emotion-RGC Net to better serve society and contribute to the healthy development of sentiment analysis technology.
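As one small example of the anonymization step recommended above, user handles could be replaced with salted hashes before any analysis; the function below is a hypothetical pre-processing utility, not part of Emotion-RGC Net itself.

```python
import hashlib
import secrets

# Hypothetical pre-processing utility: replace raw user handles with salted
# hashes before any emotion analysis, so the model never sees account names.
SALT = secrets.token_hex(16)  # in practice, stored separately from the data

def anonymize_user_id(user_id: str) -> str:
    """Return a stable pseudonymous identifier for a user handle."""
    return hashlib.sha256((SALT + user_id).encode("utf-8")).hexdigest()[:16]

print(anonymize_user_id("@example_handle"))
```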
Citation: Yan J, Pu P, Jiang L (2025) Emotion-RGC net: A novel approach for emotion recognition in social media using RoBERTa and Graph Neural Networks. PLoS ONE 20(3): e0318524. https://doi.org/10.1371/journal.pone.0318524
About the Authors:
Jiangting Yan
Roles: Conceptualization, Validation, Writing – original draft
E-mail: [email protected]
Affiliation: School of Art & Design, Shaanxi Fashion Engineering University, Xi’an, China
Pengju Pu
Roles: Data curation, Formal analysis, Writing – original draft
Affiliation: School of Art & Design, Shaanxi Fashion Engineering University, Xi’an, China
ORCID: https://orcid.org/0009-0006-2224-2168
Liheng Jiang
Roles: Supervision, Writing – review & editing
Affiliation: CVS Health Corporation, Woonsocket, Rhode Island, USA
ORCID: https://orcid.org/0009-0002-3992-0130
© 2025 Yan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.