Abstract

In visual reasoning, image features are widely used as the input to neural networks that produce answers. However, image features contain much redundant information, which makes it difficult for ordinary networks to learn accurate characterizations. In human reasoning, by contrast, descriptions are usually constructed so as to leave out irrelevant details. Inspired by this, a higher-level representation, named semantic representation, is introduced in this paper to make visual reasoning more efficient. The idea of the Gram matrix from neural style transfer research is adapted here to build a relation matrix that better represents the relational information between objects. A model using the semantic representation as input outperforms the same model using image features as input, which verifies that more accurate results can be obtained by introducing a high-level semantic representation in visual reasoning.
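As a rough illustration of the Gram-matrix idea mentioned above (a sketch, not the authors' implementation), a relation matrix over per-object feature vectors can be formed from pairwise inner products; the feature values below are placeholders:

```python
import numpy as np

def relation_matrix(object_features: np.ndarray) -> np.ndarray:
    """Gram-style relation matrix: entry (i, j) is the inner product
    between the feature vectors of objects i and j, so related objects
    with similar features produce large entries."""
    return object_features @ object_features.T

# Toy example: 3 detected objects, each described by a 4-dim feature vector.
features = np.array([
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [2.0, 1.0, 0.0, 1.0],
])

R = relation_matrix(features)  # shape (3, 3), symmetric
```

The resulting matrix is symmetric, and its diagonal holds each object's self-similarity; off-diagonal entries summarize how strongly pairs of objects relate in feature space.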

Details

Title
Semantic representation for visual reasoning
Author
Ni, Xubin; Yin, Lirong; Chen, Xiaobing; Liu, Shan; Yang, Bo; Zheng, Wenfeng
Section
Data and Signal Processing
Publication year
2019
Publication date
2019
Publisher
EDP Sciences
ISSN
2274-7214
e-ISSN
2261-236X
Source type
Conference Paper
Language of publication
English
ProQuest document ID
2276999851
Copyright
© 2019. This work is licensed under http://creativecommons.org/licenses/by/4.0 (the “License”). Notwithstanding the ProQuest Terms and conditions, you may use this content in accordance with the terms of the License.