© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Existing UAV-based pedestrian detection datasets suffer from insufficient samples, limited scene diversity, missing perspectives, and low resolution. To address these issues, this paper proposes a UAV-based pedestrian detection benchmark dataset named the Novel Surveillance View (NSV). The dataset covers diverse scenes and pedestrian information captured from multiple perspectives, and it is accompanied by a data mining approach that leverages tracking and optical flow information. This approach significantly improves data acquisition efficiency while ensuring annotation quality. Furthermore, an improved pedestrian detection method is proposed to overcome the performance degradation caused by significant perspective changes in top-down UAV views. First, the View-Agnostic Decomposition (VAD) module decouples features into perspective-dependent and perspective-independent branches to improve the model’s generalization across perspective variations. Second, the Deformable Conv-BN-SiLU (DCBS) module dynamically adjusts the shape of the receptive field to better fit the geometric deformations of pedestrians. Finally, the Context-Aware Pyramid Spatial Attention (CPSA) module integrates multi-scale features with attention mechanisms to handle drastic variations in target scale. Experimental results show that the proposed method improves the mean Average Precision (mAP) by 9% on the NSV dataset, indicating that optimizing perspective features effectively improves pedestrian detection accuracy from UAV perspectives.
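To make the mAP figure reported in the abstract concrete, the following is a minimal sketch of how average precision is commonly computed for a single class such as pedestrians. It is not the authors' evaluation code. The IoU threshold of 0.5, the all-point interpolation, the single-image setting, and the names `iou` and `average_precision` are all assumptions made for illustration; the abstract does not state the evaluation protocol. With only one class, mAP reduces to this AP value.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],  # (confidence, box) pairs
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> float:
    """All-point interpolated AP for one class on one image (illustrative)."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    tp_flags = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # A true positive must overlap enough with a not-yet-matched box.
        hit = best_idx >= 0 and best_iou >= iou_threshold and not matched[best_idx]
        if hit:
            matched[best_idx] = True
        tp_flags.append(hit)

    # Precision and recall after each ranked detection.
    precisions, recalls = [], []
    tp = 0
    for rank, hit in enumerate(tp_flags, start=1):
        tp += int(hit)
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))

    # Make precision non-increasing, then sum the area under the curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

In a real benchmark, detections are pooled over every image in the test set before ranking, and some protocols additionally average AP over several IoU thresholds; this sketch omits both for brevity.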

Details

Title
Novel Surveillance View: A Novel Benchmark and View-Optimized Framework for Pedestrian Detection from UAV Perspectives
Author
Chen, Chenglizhao 1; Gao, Shengran 1; Pei, Hongjuan 2; Chen, Ning 3; Shi, Lei 4; Zhang, Peiying 1

1 Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China; [email protected] (C.C.); [email protected] (S.G.); [email protected] (P.Z.); Shandong Key Laboratory of Intelligent Oil & Gas Industrial Software, Qingdao 266580, China
2 School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China
3 School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China; [email protected]
4 Key Laboratory of Intelligent Game, Yangtze River Delta Research Institute of NPU, Taicang 215400, China; [email protected]; State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China; Key Laboratory of Education Informatization for Nationalities (Yunnan Normal University), Ministry of Education, Kunming 650092, China
First page
772
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
14248220
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3165918678