Abstract

Recently, multimodal approaches that combine multiple modalities have been attracting attention as a way to recognize emotions more accurately. Although multimodal fusion delivers strong performance, it is computationally intensive and difficult to run in real time. In addition, large-scale emotional datasets for training are fundamentally scarce; Korean emotional datasets, in particular, offer fewer resources than English-language datasets, which limits the generalization capability of emotion recognition models. In this study, we propose a lightweight modality fusion method, MMER-LMF, to overcome the shortage of Korean emotional datasets and improve emotion recognition performance while reducing model training complexity. To this end, we present three algorithms that fuse emotion scores according to the reliability of each model: confidence-based weight adjustment, correlation-coefficient-based weighting, and a Dempster–Shafer theory-based combination. The fused scores consist of text emotion scores extracted with a pre-trained large language model and video emotion scores extracted with a 3D CNN model. The three algorithms showed similar classification performance, differing only slightly on the disgust emotion. The fused model achieved 80% accuracy and 79% recall, higher than the 58% obtained using the text modality alone and the 72% obtained using the video modality alone. Compared with previous studies on Korean datasets, this is a superior result in terms of both training complexity and performance.
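As an illustration of the kind of score-level (late) fusion the abstract describes, the sketch below combines per-class emotion scores from a text model and a video model using a simple confidence-weighted average and a singleton Dempster–Shafer combination. This is a minimal sketch based only on the abstract: the emotion label set, the exact weighting rules, and the function names are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical emotion label set; the paper's exact classes are not given in the abstract.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]

def confidence_weighted_fusion(text_scores: np.ndarray, video_scores: np.ndarray) -> np.ndarray:
    """Weight each modality by its own confidence (here: its maximum class probability)
    and average the per-class scores. This mirrors the 'confidence-based weight
    adjustment' idea only in spirit; the paper's weighting rule may differ."""
    w_text = text_scores.max()
    w_video = video_scores.max()
    fused = w_text * text_scores + w_video * video_scores
    return fused / fused.sum()

def dempster_shafer_fusion(text_scores: np.ndarray, video_scores: np.ndarray) -> np.ndarray:
    """Simplified Dempster's rule of combination, treating each class score as a
    basic belief assignment on a singleton hypothesis (no compound focal sets)."""
    joint = text_scores * video_scores      # agreement mass per class
    conflict = 1.0 - joint.sum()            # mass assigned to conflicting class pairs
    if np.isclose(conflict, 1.0):
        raise ValueError("Total conflict; Dempster's rule is undefined.")
    return joint / (1.0 - conflict)

if __name__ == "__main__":
    # Toy score vectors standing in for LLM-based text scores and 3D-CNN video scores.
    text = np.array([0.05, 0.10, 0.05, 0.55, 0.10, 0.10, 0.05])
    video = np.array([0.10, 0.05, 0.05, 0.45, 0.20, 0.10, 0.05])
    print(dict(zip(EMOTIONS, confidence_weighted_fusion(text, video).round(3))))
    print(dict(zip(EMOTIONS, dempster_shafer_fusion(text, video).round(3))))
```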

Details

Title
MMER-LMF: Multi-Modal Emotion Recognition in Lightweight Modality Fusion
Author
Kim, Eun-Hee 1; Lim, Myung-Jin 2; Shin, Ju-Hyun 2

1 Department of Computer Science, Chosun University, Gwangju 61452, Republic of Korea; [email protected]
2 Department of Future Convergence, Chosun University, Gwangju 61452, Republic of Korea; [email protected]
First page
2139
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3217731456
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.