© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Video Object Tracking (VOT) is a critical task in computer vision. Siamese-based and Transformer-based trackers are widely used in VOT, but they struggle on the OTB100 benchmark because it lacks a dedicated training set, which highlights the difficulty of generalizing to unseen data. To address this issue, this paper proposes a method based on tensor decomposition, an underexplored concept in object-tracking research. The method represents video sequences as four-mode tensors and applies L1-norm tensor decomposition to build a real-time background subtraction algorithm, which models the target–background relationship and adapts to environmental changes, leading to accurate and robust tracking. The paper also applies an improved multi-kernel correlation filter within a single frame, locating and tracking the target by measuring the correlation between the target template and the input image. To further improve localization precision and robustness, Tucker2 decomposition is used to integrate appearance and motion patterns into composite heatmaps. The method is evaluated on the OTB100 benchmark dataset, where it improves on traditional methods in both performance and speed. Experimental results show a 15.8% improvement in AUC and a ten-fold increase in speed compared to typical deep learning-based methods, providing an efficient and accurate real-time tracking solution, particularly in scenarios with similar target–background characteristics, high-speed motion, and limited target movement.
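
To make the Tucker2 idea in the abstract concrete, here is a minimal Python sketch of low-rank background modeling on a small synthetic video. It is not the authors' implementation. It assumes a standard L2 truncated HOSVD computed with SVD, not the L1-norm decomposition the paper describes, and a three-mode tensor (height × width × time) in place of the paper's four-mode representation. The data are synthetic, and the names `unfold`, `tucker2_hosvd`, and `reconstruct` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker2_hosvd(tensor, rank_rows, rank_cols):
    """Tucker2 via truncated HOSVD (L2 stand-in, an assumption of this sketch).

    Modes 0 (rows) and 1 (columns) are compressed; mode 2 (time) is kept whole.
    """
    U1 = np.linalg.svd(unfold(tensor, 0), full_matrices=False)[0][:, :rank_rows]
    U2 = np.linalg.svd(unfold(tensor, 1), full_matrices=False)[0][:, :rank_cols]
    # Core tensor: project modes 0 and 1 onto the two factor subspaces.
    core = np.einsum("hwt,hr,wc->rct", tensor, U1, U2)
    return core, U1, U2

def reconstruct(core, U1, U2):
    """Map the core back to the original height x width x time space."""
    return np.einsum("rct,hr,wc->hwt", core, U1, U2)

# Synthetic video: a static rank-1 background plus a small moving bright patch.
H, W, T = 24, 32, 10
background = np.outer(np.linspace(0.2, 0.8, H), np.linspace(0.5, 1.0, W))
video = np.repeat(background[:, :, None], T, axis=2)
target_mask = np.zeros((H, W, T), dtype=bool)
for t in range(T):
    target_mask[10:13, 2 + 2 * t : 5 + 2 * t, t] = True
video[target_mask] += 1.0

# A low-rank Tucker2 model captures the background; the residual highlights the target.
core, U1, U2 = tucker2_hosvd(video, rank_rows=1, rank_cols=1)
low_rank = reconstruct(core, U1, U2)
residual = np.abs(video - low_rank)

print("core shape:", core.shape)
print("mean residual on target:", residual[target_mask].mean())
print("mean residual elsewhere:", residual[~target_mask].mean())
```

In this toy setting the residual acts as a crude foreground heatmap. An L1-norm formulation, as used in the paper, would be expected to be less sensitive to the outlier pixels contributed by the moving target than this L2 stand-in.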

Details

Title
TensorTrack: Tensor Decomposition for Video Object Tracking
Author
Gu, Yuntao 1; Zhao, Pengfei 2; Cheng, Lan 1; Guo, Yuanjun 3; Wang, Haikuan 2; Ding, Wenjun 4; Liu, Yu 4

1 College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China; [email protected] (Y.G.); [email protected] (L.C.)
2 School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China; [email protected] (P.Z.); [email protected] (H.W.)
3 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
4 China Construction Third Bureau Digital Engineering Co., Ltd., Shenzhen 518106, China; [email protected] (W.D.); [email protected] (Y.L.)
First page
568
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2227-7390
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3171091691