Abstract

To address the limitations of SiamFC, such as a shallow network architecture, a fixed template, and a lack of semantic understanding and temporal modeling, this paper proposes a robust target-tracking algorithm that incorporates both channel and spatial attention mechanisms. The backbone network adopts depthwise separable convolution to improve computational efficiency, adjusts the output stride and convolution kernel size to strengthen feature extraction, and optimizes the network structure through neural architecture search, enabling the extraction of deeper, richer features with stronger semantic information. In addition, channel attention is added to the target template branch after feature extraction so that the branch adaptively adjusts the weights of different feature channels. In the search region branch, channel attention followed by spatial attention is introduced to model spatial dependencies among pixels and to suppress background and distractor information. Finally, we evaluate the proposed algorithm on the OTB2015, VOT2016, and VOT2018 datasets. The results show that our method achieves a tracking precision of 0.631 and a success rate of 0.468, improving upon the original SiamFC by 3.4% and 1.2%, respectively. The algorithm tracks robustly in complex scenarios and maintains real-time performance while reducing both the parameter count and the overall computational complexity.
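The claim that depthwise separable convolution reduces the parameter count can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the layer sizes (256 input channels, 256 output channels, a 3 x 3 kernel) are assumptions chosen for the example and are not taken from the paper, and bias terms are omitted.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration (not from the paper).
C_IN, C_OUT, K = 256, 256, 3

standard = standard_conv_params(C_IN, C_OUT, K)
separable = depthwise_separable_params(C_IN, C_OUT, K)
ratio = separable / standard  # equals 1/C_OUT + 1/K**2

print(standard, separable, round(ratio, 3))  # 589824 67840 0.115
```

Under these assumed sizes the separable layer needs roughly 11.5% of the weights of the standard layer, which matches the general ratio 1/C_out + 1/k^2 for this factorization.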

Details

Title
An Enhanced Siamese Network-Based Visual Tracking Algorithm with a Dual Attention Mechanism
Author
Cai Xueying 1 ; Feng, Sheng 1 ; Varshosaz, Masood 1 ; Senang, Ying 1 ; Zhou Binchao 1 ; Jia Wentao 1 ; Yang, Jianing 1 ; Wei Canlin 1 ; Feng Yucheng 2

1 Institute of Artificial Intelligence, Shaoxing University, Shaoxing 312000, China; [email protected] (X.C.); [email protected] (S.Y.); [email protected] (B.Z.); [email protected] (W.J.); [email protected] (J.Y.); [email protected] (C.W.)
2 College of Chemical, Dalian University of Technology, Dalian 116024, China
Publication title
Volume
14
Issue
13
First page
2579
Number of pages
15
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
20799292
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-06-26
Milestone dates
2025-05-18 (Received); 2025-06-23 (Accepted)
First posting date
26 Jun 2025
ProQuest document ID
3229143373
Document URL
https://www.proquest.com/scholarly-journals/enhanced-siamese-network-based-visual-tracking/docview/3229143373/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-07-11
Database
ProQuest One Academic