© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

RGB-D salient object detection (SOD) aims to locate the most eye-catching object in a visual input by fusing complementary information from the RGB and depth modalities. Most existing RGB-D SOD methods integrate multi-modal features indiscriminately when generating the saliency map, ignoring the ambiguity between modalities. To make better use of multi-modal complementary information and reduce the negative impact of this ambiguity, this paper proposes a novel Alternate Steered Attention and Trapezoidal Pyramid Fusion Network (A2TPNet) for RGB-D SOD, composed of a Cross-modal Alternate Fusion Module (CAFM) and a Trapezoidal Pyramid Fusion Module (TPFM). CAFM fuses cross-modal features: it accounts for the ambiguity between cross-modal data through Alternate Steered Attention (ASA), and it reduces the interference of redundant information and non-salient features during the interaction through a collaboration mechanism that combines channel attention and spatial attention. TPFM combines multi-scale features to strengthen the model's representation of contextual semantics. Extensive experiments on five publicly available datasets demonstrate that the proposed model consistently outperforms 17 state-of-the-art methods.
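
The abstract names three ingredients: channel attention, spatial attention, and multi-scale fusion. The NumPy sketch below is a rough illustration of those general ideas only and is not the authors' CAFM, ASA, or TPFM. It rests on several assumptions that do not come from the paper: the attention weights are parameter-free (a sigmoid over pooled activations, with no learned layers), the "alternation" is simply depth steering RGB and then the steered RGB steering depth, upsampling is nearest-neighbour, and every function name is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Per-channel weights in (0, 1) from global average pooling; shape (C, 1, 1)."""
    return sigmoid(feat.mean(axis=(1, 2), keepdims=True))

def spatial_attention(feat):
    """Per-pixel weights in (0, 1) from the mean over channels; shape (1, H, W)."""
    return sigmoid(feat.mean(axis=0, keepdims=True))

def alternate_fusion(rgb, depth):
    """Hypothetical alternating scheme (an assumption, not the paper's ASA).

    Both inputs are feature maps of shape (C, H, W).
    """
    # Step 1: attention computed from depth re-weights the RGB features.
    rgb_steered = rgb * channel_attention(depth) * spatial_attention(depth)
    # Step 2: attention computed from the steered RGB re-weights the depth features.
    depth_steered = depth * channel_attention(rgb_steered) * spatial_attention(rgb_steered)
    # Element-wise sum as a simple way to merge the two steered streams.
    return rgb_steered + depth_steered

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def pyramid_fuse(levels):
    """Top-down multi-scale fusion (a generic pyramid, not the paper's TPFM).

    `levels` is ordered fine to coarse; each level has half the spatial
    resolution of the previous one and the same number of channels.
    """
    fused = levels[-1]
    for feat in reversed(levels[:-1]):
        fused = feat + upsample2x(fused)
    return fused

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.standard_normal((8, 16, 16))
    depth = rng.standard_normal((8, 16, 16))
    print(alternate_fusion(rgb, depth).shape)  # (8, 16, 16)
```

A trained network would replace the parameter-free weights with learned layers; the sketch only shows how attention from one modality can gate the other before features from several scales are merged.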

Details

Title
A2TPNet: Alternate Steered Attention and Trapezoidal Pyramid Fusion Network for RGB-D Salient Object Detection
Author
Duan, Songsong 1; Gao, Xiuju 2; Xia, Chenxing 3; Ge, Bin 1

1 College of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China; [email protected] (S.D.); [email protected] (C.X.); [email protected] (B.G.)
2 School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, China
3 College of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China; [email protected] (S.D.); [email protected] (C.X.); [email protected] (B.G.); Hefei Comprehensive National Science Center, Institute of Energy, Hefei 230031, China; Anhui Purvar Bigdata Technology Co., Ltd., Huainan 232001, China
First page
1968
Publication year
2022
Publication date
2022
Publisher
MDPI AG
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2685978400