© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Most existing small object detection methods rely on residual blocks to process deep feature maps. However, these residual blocks, composed of multiple large-kernel convolution layers, incur high computational cost and carry redundant information, which makes it difficult to improve detection performance for small objects. To address this, we design an improved feature pyramid network, the L Feature Pyramid Network (L-FPN), which restructures the original FPN to optimize the allocation of computational resources for small object detection. Building on L-FPN, we propose a small object detector named BPD-YOLO. We introduce a Dual-phase Asymptotic Feature Fusion (DAFF) mechanism: in the first phase, the shallow and deep semantic features extracted by the backbone network are fused in parallel to mitigate the semantic gap; in the second phase, the intermediate semantic layers are progressively integrated, so that shallow and deep feature representations are fused effectively. We also design the Deep Spatial Pyramid Fusion (DSPF) module, which generates multi-scale feature representations as an alternative to conventional residual block stacking, thereby reducing computational overhead. In the shallow feature extraction stage, DSPF focuses on semantic integration and enhances the extraction of small object features. This strategy, which adaptively selects different modules according to the resolution of the feature maps, is referred to as the Decoupled feature Extraction-semantic Integration (DEI) mechanism. Finally, we conduct extensive experiments and thorough evaluations on the VisDrone and TinyPerson datasets. On VisDrone, compared with the baseline model YOLOv8n + p2, BPD-YOLO with L-FPN achieves a 2.8% improvement in mAP50 and a 1.4% increase in mAP50-95. On TinyPerson, BPD-YOLO further demonstrates its advantage in high-resolution feature extraction, improving detection accuracy while significantly reducing computational cost.
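
For readers less familiar with the two reported metrics, the sketch below may help. It assumes the common COCO-style convention, in which mAP50 counts a detection as correct when its IoU with the ground truth is at least 0.50, and mAP50-95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The box coordinates and the helper name `thresholds_passed` are hypothetical and chosen only for illustration; this is not the authors' evaluation code.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
# mAP50 uses only the first one; mAP50-95 averages over all ten.
THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Hypothetical helper: IoU thresholds at which pred matches truth."""
    value = iou(pred, truth)
    return [t for t in THRESHOLDS if value >= t]


# Illustrative boxes: the same 2-pixel localization error
# applied to a 10x10 object and to a 100x100 object.
small_truth, small_pred = (0, 0, 10, 10), (2, 2, 12, 12)
large_truth, large_pred = (0, 0, 100, 100), (2, 2, 102, 102)

print(iou(small_pred, small_truth), thresholds_passed(small_pred, small_truth))
print(iou(large_pred, large_truth), thresholds_passed(large_pred, large_truth))
```

Under these assumptions, the 2-pixel shift leaves the small box below even the 0.50 threshold, while the large box still matches at most thresholds. This is one reason small object benchmarks such as those named in the abstract are sensitive to localization quality.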

Details

Title
A lightweight small object detection model for UAV images based on deep semantic integration
Author
Chao, Manxin 1; Peng, Can 2; Yun, Lijun 1; Zhang, Chunjie 2; Wang, Huihua 3; Chen, Zaiqing 2

1 The School of Information, Yunnan Normal University, Kunming, 650500, Yunnan, China (ROR: https://ror.org/00sc9n023) (GRID: grid.410739.8) (ISNI: 0000 0001 0723 6903); Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, Kunming, 650500, Yunnan, China (ROR: https://ror.org/02yrxdp92) (GRID: grid.481523.9) (ISNI: 0000 0004 1777 5849); Southwest United Graduate School, Kunming, 650092, Yunnan, China
2 The School of Information, Yunnan Normal University, Kunming, 650500, Yunnan, China (ROR: https://ror.org/00sc9n023) (GRID: grid.410739.8) (ISNI: 0000 0001 0723 6903); Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, Kunming, 650500, Yunnan, China (ROR: https://ror.org/02yrxdp92) (GRID: grid.481523.9) (ISNI: 0000 0004 1777 5849)
3 Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, Kunming, 650500, Yunnan, China (ROR: https://ror.org/02yrxdp92) (GRID: grid.481523.9) (ISNI: 0000 0004 1777 5849); School of Physics and Electronic Information, Yunnan Normal University, Kunming, 650500, Yunnan, China (ROR: https://ror.org/00sc9n023) (GRID: grid.410739.8) (ISNI: 0000 0001 0723 6903)
Pages
31888
Section
Article
Publication year
2025
Publication date
2025
Publisher
Nature Publishing Group
e-ISSN
2045-2322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3244980640