Multi-modal transformer using two-level visual features for fake news detection

Abstract

Fake news containing multimedia data is ubiquitous on the Internet, and users find it difficult to identify. Automatic multi-modal fake news detectors are therefore needed. However, existing works make poor use of visual information and do not fully consider the semantic interaction between modalities. In this paper, we propose the multi-modal transformer using two-level visual features (MTTV) for fake news detection. First, we model the texts and images of news uniformly as sequences that a transformer can process, and we use two-level visual features, i.e., a global feature and entity-level features, to improve the utilization of news images. Second, we extend the transformer model used in natural language processing to a multi-modal transformer in which multi-modal data interact fully, capturing the semantic relationships between them. In addition, we propose a scalable classifier to improve classification balance in fine-grained fake news detection, which suffers from class imbalance. Extensive experiments on two public datasets demonstrate that our method achieves significant performance improvements over state-of-the-art methods. The source code is available at https://github.com/cqu-wb/MTTV.
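The idea of treating text and two-level visual features as one sequence can be illustrated with a small sketch. This is not the authors' implementation (their code is in the repository linked above); the function names, the toy 4-dimensional vectors, the identity query/key/value projections, and the mean pooling step are all assumptions made for illustration. It only shows text tokens, one global image feature, and several entity-level image features being joined into a single sequence so that self-attention lets every element attend to every other, across modalities.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def build_sequence(text_tokens, global_feature, entity_features):
    """Join all modalities into one list of (modality_tag, vector) pairs.

    Hypothetical helper: the tags only record where each vector came from.
    """
    seq = [("text", v) for v in text_tokens]
    seq.append(("image_global", global_feature))
    seq.extend(("image_entity", v) for v in entity_features)
    return seq


def self_attention(vectors):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (no learned projections), to keep the sketch short.
    """
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)])
    return out


def mean_pool(vectors):
    """Average the sequence into one vector (an assumed pooling choice)."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


# Toy inputs: two text-token embeddings, one global image feature,
# and two entity-level image features, all made up for the example.
text_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
global_feature = [0.0, 0.0, 1.0, 0.0]
entity_features = [[0.0, 0.0, 0.0, 1.0], [0.5, 0.5, 0.0, 0.0]]

sequence = build_sequence(text_tokens, global_feature, entity_features)
vectors = [v for _, v in sequence]
attended = self_attention(vectors)   # every position mixes text and image information
news_vector = mean_pool(attended)    # one vector per news item, ready for a classifier
```

A real model would add learned projections, multiple heads and layers, and position or modality embeddings; the sketch leaves those out and does not attempt to reproduce the scalable classifier mentioned in the abstract.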

Details

Title

Multi-modal transformer using two-level visual features for fake news detection

Author

Wang, Bin¹; Feng, Yong¹; Xiong, Xian-cai²; Wang, Yong-heng³; Qiang, Bao-hua⁴

¹ Chongqing University, College of Computer Science, Chongqing, China (GRID:grid.190737.b) (ISNI:0000 0001 0154 0904)
² Key Laboratory of Monitoring, Evaluation and Early Warning of Territorial Spatial Planning Implementation, Ministry of Natural Resources, Chongqing, China (GRID:grid.453137.7) (ISNI:0000 0004 0406 0561); Chongqing Institute of Planning and Natural Resources Investigation and Monitoring, Chongqing, China (GRID:grid.453137.7)
³ 8# of Zhejiang Lab, Hangzhou, China (GRID:grid.510538.a) (ISNI:0000 0004 8156 0818)
⁴ Guilin University of Electronic Technology, Guangxi Key Laboratory of Trusted Software, Guilin, China (GRID:grid.440723.6) (ISNI:0000 0001 0807 124X)

Pages

10429-10443

Publication year

2023

Publication date

May 2023

Publisher

Springer Nature B.V.

ISSN

0924-669X

e-ISSN

1573-7497

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1007/s10489-022-04055-5

ProQuest document ID

2815842915

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022. Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
