Abstract

Visual tracking of generic objects is one of the fundamental yet challenging problems in computer vision. Here, we propose a novel fully convolutional Siamese network that solves visual tracking by directly predicting the target bounding box in an end-to-end manner. We first reformulate the visual tracking task as two subproblems: a classification problem that predicts the category of each pixel, and a regression problem that estimates the object status at that pixel. With this decomposition, we design a simple yet effective Siamese classification and regression framework, termed SiamCAR, which consists of two subnetworks: a Siamese subnetwork for feature extraction and a classification-regression subnetwork for direct bounding box prediction. Since the proposed framework is both proposal-free and anchor-free, SiamCAR avoids the tedious hyper-parameter tuning of anchors, which considerably simplifies training. To demonstrate that a much simpler tracking framework can achieve superior tracking results, we conduct extensive experiments and comparisons with state-of-the-art trackers on several challenging benchmarks. Without bells and whistles, SiamCAR achieves leading performance at real-time speed. Furthermore, the ablation study validates that the proposed framework is effective with various backbone networks and can benefit from deeper networks. Code is available at https://github.com/ohhhyeahhh/SiamCAR.
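
As an illustration of the decomposition described in the abstract, the following Python sketch shows how a per-pixel classification map and a per-pixel regression output might be combined into one bounding box. It is not the authors' implementation. The function name `decode_box`, the `stride` and `offset` parameters, and the assumption that the regression output at each pixel consists of four distances (left, top, right, bottom) to the box sides are hypothetical choices made for this example; the abstract only states that the regression estimates the object status at each pixel.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def decode_box(
    cls_map: List[List[float]],
    reg_map: List[List[Sequence[float]]],
    stride: int = 8,   # assumed spacing between map cells in image pixels
    offset: int = 0,   # assumed image coordinate of map cell (0, 0)
) -> Tuple[Box, float]:
    """Return the box predicted at the highest-scoring map location.

    cls_map[y][x] is a foreground score for that location.
    reg_map[y][x] is assumed to hold (left, top, right, bottom) distances
    from that location to the four sides of the target box.
    """
    best_score = float("-inf")
    best_y, best_x = 0, 0
    # Classification step: find the location most likely to be the target.
    for y, row in enumerate(cls_map):
        for x, score in enumerate(row):
            if score > best_score:
                best_score, best_y, best_x = score, y, x

    # Map the chosen cell back to image coordinates (assumed linear mapping).
    cx = offset + best_x * stride
    cy = offset + best_y * stride

    # Regression step: convert the four distances into box corners.
    left, top, right, bottom = reg_map[best_y][best_x]
    box = (cx - left, cy - top, cx + right, cy + bottom)
    return box, best_score
```

No anchor boxes or region proposals appear in this sketch, which is the sense in which such a decoding is anchor-free and proposal-free. The repository linked in the abstract remains the reference for the actual method.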

Details

Title
Joint Classification and Regression for Visual Tracking with Fully Convolutional Siamese Networks
Author
Cui, Ying 1 ; Guo, Dongyan 1 ; Shao, Yanyan 1 ; Wang, Zhenhua 1 ; Shen, Chunhua 2 ; Zhang, Liyan 3 ; Chen, Shengyong 4

1 Zhejiang University of Technology, College of Computer Science and Technology, Hangzhou, China (GRID:grid.469325.f) (ISNI:0000 0004 1761 325X)
2 Zhejiang University, Hangzhou, China (GRID:grid.13402.34) (ISNI:0000 0004 1759 700X)
3 Nanjing University of Aeronautics and Astronautics, College of Computer Science and Technology, Nanjing, China (GRID:grid.64938.30) (ISNI:0000 0000 9558 9911)
4 Tianjin University of Technology, School of Computer Science and Engineering, Tianjin, China (GRID:grid.265025.6) (ISNI:0000 0000 9736 3676)
Pages
550-566
Publication year
2022
Publication date
Feb 2022
Publisher
Springer Nature B.V.
ISSN
0920-5691
e-ISSN
1573-1405
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2629163066
Copyright
© The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.