
Copyright © 2020 Qibin Zheng et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. http://creativecommons.org/licenses/by/4.0/

Abstract

Cross-modal retrieval aims to find relevant data across different modalities, such as images and text. To bridge the modality gap, most existing methods require a large number of coupled sample pairs as training data. To reduce this demand, we propose a cross-modal retrieval framework that exploits both coupled and uncoupled samples. The framework consists of two parts: Abstraction, which learns high-level single-modal representations from uncoupled samples, and Association, which links the different modalities through a small number of coupled training samples. Under this framework, we implement a cross-modal retrieval method based on the consistency between the semantic structures of the modalities. First, both images and text are given a semantic structure-based representation, which encodes each sample as its similarities to reference points generated by single-modal clustering. Then, the reference points of the different modalities are aligned through an active learning strategy. Finally, cross-modal similarity is measured by the consistency between the semantic structures. Experimental results demonstrate that, given a proper abstraction of single-modal data, the relationship between modalities can be simplified, and even limited coupled cross-modal training data suffice for satisfactory retrieval accuracy.
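The pipeline described in the abstract can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`reference_points`, `semantic_structure`, `cross_modal_similarity`), the plain k-means clustering, and the exponential distance-to-similarity mapping are all choices made here for clarity, and the paper's active-learning alignment step is assumed to have already matched the order of reference points across modalities.

```python
import numpy as np

def reference_points(features, k, iters=20, seed=0):
    """Cluster single-modal features; the centroids serve as reference points.
    A basic k-means, standing in for whatever clustering the method uses."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each sample to its nearest center.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned samples.
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def semantic_structure(x, centers):
    """Semantic structure-based representation: encode a sample as its
    (normalized) similarities to the reference points."""
    d = np.linalg.norm(centers - x, axis=1)
    s = np.exp(-d)          # turn distances into similarities (assumed mapping)
    return s / s.sum()

def cross_modal_similarity(img_repr, txt_repr):
    """Consistency between two semantic structures, here cosine similarity,
    assuming the reference points of both modalities are already aligned."""
    num = float(img_repr @ txt_repr)
    den = np.linalg.norm(img_repr) * np.linalg.norm(txt_repr) + 1e-12
    return num / den
```

With aligned reference points, a text query can then be ranked against a gallery of images purely by comparing these similarity profiles, which is what lets the method get by with few coupled training pairs.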

Details

Title
Abstraction and Association: Cross-Modal Retrieval Based on Consistency between Semantic Structures
Author
Zheng, Qibin 1; Ren, Xiaoguang 2; Liu, Yi 2; Qin, Wei 2

1 Army Engineering University of PLA, Nanjing, China
2 National Innovation Institute of Defense Technology (NIIDT), Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
Editor
Francesco Lolli
Publication year
2020
Publication date
2020
Publisher
John Wiley & Sons, Inc.
ISSN
1024123X
e-ISSN
15635147
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2403867369