Abstract

In recent years, the Transformer has achieved great success in NLP and is gradually being applied to computer vision. However, because of the nature of image data, the computational complexity of the Transformer is quite high. The windowing operation proposed by Swin Transformer effectively addresses this problem. We observe that Swin Transformer has the same hierarchical structure as a CNN, so we propose SwinF, a network that adds feature fusion to Swin Transformer. On a COCO-type dataset, Swin Transformer achieves 40.3 mAP in target detection, while SwinF achieves 42.5 mAP.
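
To make the complexity claim concrete, the sketch below compares the approximate cost of global self-attention with that of window-restricted attention. The cost formulas are the ones given in the original Swin Transformer paper, not in this abstract. The function names and the example sizes (a 56 x 56 token grid, 96 channels, 7 x 7 windows) are illustrative assumptions.

```python
def global_attention_cost(h, w, c):
    """Approximate multiply-add count for global self-attention
    over an h x w grid of tokens with c channels."""
    n = h * w  # number of tokens
    # 4*n*c^2 covers the linear projections; 2*n^2*c covers the attention itself.
    return 4 * n * c**2 + 2 * n**2 * c


def window_attention_cost(h, w, c, m):
    """Approximate multiply-add count when attention is restricted to
    non-overlapping m x m windows (each token attends to m*m tokens)."""
    n = h * w
    # The quadratic term n^2 becomes m^2 * n, which is linear in n for fixed m.
    return 4 * n * c**2 + 2 * (m**2) * n * c


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not taken from the abstract).
    h = w = 56
    c = 96
    m = 7
    g = global_attention_cost(h, w, c)
    l = window_attention_cost(h, w, c, m)
    print(f"global:   {g:,}")
    print(f"windowed: {l:,}")
    print(f"ratio:    {g / l:.1f}x")
```

With a fixed window size, the windowed cost grows linearly with the number of tokens, whereas the global cost grows quadratically. This is why the windowing operation helps on high-resolution images.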

Details

Title
SwinF: Swin Transformer with feature fusion in target detection
Author
Li, Te 1; Wang, Huajun 1; Li, Guangzhi 1; Liu, Songshan 1; Tang, Li 1

1 Chengdu University of Technology, Chengdu, Sichuan, 610000, China
First page
012027
Publication year
2022
Publication date
Jun 2022
Publisher
IOP Publishing
ISSN
17426588
e-ISSN
17426596
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2675242082
Copyright
Published under licence by IOP Publishing Ltd. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.