Full Text

Abstract

In today’s digital era, rumors spreading on social media threaten societal stability and individuals’ daily lives, especially multimodal rumors. Hence, there is an urgent need for effective multimodal rumor detection methods. However, existing approaches often overlook the insufficient diversity of multimodal samples in the feature space and the hidden similarities and differences among them. To address these challenges, we propose MVACLNet, a Multimodal Virtual Augmentation Contrastive Learning Network. In MVACLNet, we first design a Hierarchical Textual Feature Extraction (HTFE) module to extract comprehensive textual features from multiple perspectives. We then fuse the textual and visual features using a modified cross-attention mechanism, which operates from different perspectives at the feature value level, to obtain authentic multimodal feature representations. Following this, we devise a Virtual Augmentation Contrastive Learning (VACL) module as an auxiliary training module. It leverages ground-truth labels and additionally generated virtual multimodal feature representations to enhance contrastive learning, thereby helping the model capture more crucial similarities and differences among multimodal samples. Meanwhile, it imposes a Kullback–Leibler (KL) divergence constraint between the predicted probability distributions of the virtual multimodal feature representations and their corresponding virtual labels to help extract more content-invariant multimodal features. Finally, the authentic multimodal feature representations are fed into a rumor classifier for detection. Experiments on two real-world datasets demonstrate the effectiveness and superiority of MVACLNet in multimodal rumor detection.
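
The abstract describes the VACL module only at a high level. The following PyTorch sketch is one plausible reading of its two training objectives, not the authors' implementation: all function and variable names are hypothetical, and the mixup-style interpolation used to generate the virtual multimodal feature representations and their soft virtual labels is an assumption, since the abstract does not specify the generation mechanism.

import torch
import torch.nn.functional as F

def virtual_augment(feats, labels, num_classes, alpha=0.4):
    # Assumed mechanism: mix random pairs of authentic multimodal features
    # to create virtual features, and mix their one-hot labels into soft
    # "virtual labels" with the same coefficient.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(feats.size(0))
    v_feats = lam * feats + (1.0 - lam) * feats[perm]
    one_hot = F.one_hot(labels, num_classes).float()
    v_labels = lam * one_hot + (1.0 - lam) * one_hot[perm]
    return v_feats, v_labels

def kl_constraint(classifier, v_feats, v_labels):
    # KL divergence between the predicted probability distributions of the
    # virtual features and their corresponding virtual labels, mirroring
    # the constraint described in the abstract.
    log_probs = F.log_softmax(classifier(v_feats), dim=-1)
    return F.kl_div(log_probs, v_labels, reduction="batchmean")

def supervised_contrastive_loss(feats, labels, temperature=0.1):
    # Standard label-aware contrastive loss: samples sharing a label are
    # positives. Running it over authentic plus virtual features is how this
    # sketch interprets "enhance contrastive learning".
    z = F.normalize(feats, dim=-1)
    sim = (z @ z.t()) / temperature
    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # avoid -inf * 0 = NaN
    per_sample = (log_prob * pos.float()).sum(1) / pos.sum(1).clamp(min=1)
    return -per_sample.mean()

# Toy usage: 8 fused multimodal representations of dimension 256, 2 classes.
feats = torch.randn(8, 256)
labels = torch.randint(0, 2, (8,))
classifier = torch.nn.Linear(256, 2)
v_feats, v_labels = virtual_augment(feats, labels, num_classes=2)
all_feats = torch.cat([feats, v_feats])
all_labels = torch.cat([labels, v_labels.argmax(dim=1)])  # hardened labels (assumption)
auxiliary_loss = (supervised_contrastive_loss(all_feats, all_labels)
                  + kl_constraint(classifier, v_feats, v_labels))

In this toy usage the virtual features enter the contrastive term with hardened (argmax) labels, another assumption, while the KL term matches the classifier's predictions on the virtual features to their soft virtual labels, as the abstract describes.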

Details

Title
MVACLNet: A Multimodal Virtual Augmentation Contrastive Learning Network for Rumor Detection
Author
Liu, Xin 1; Pang, Mingjiang 1; Li, Qiang 2; Zhou, Jiehan 3; Wang, Haiwen 1; Yang, Dawei 1

1 Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), No. 66, West Changjiang Road, Huangdao District, Qingdao 266580, China; [email protected] (M.P.); [email protected] (H.W.); [email protected] (D.Y.)
2 Scientific and Technological Innovation Center of ARI, Beijing 100020, China; [email protected]
3 Information Technology and Electrical Engineering, University of Oulu, 90570 Oulu, Finland; [email protected]
First page
199
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
1999-4893
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3059252233
Copyright
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.