Full Text

Turn on search term navigation

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

A multi-scale UAV aerial image object detection model MS-YOLOv7 based on YOLOv7 was proposed to address the issues of a large number of objects and a high proportion of small objects that commonly exist in the Unmanned Aerial Vehicle (UAV) aerial image. The new network is developed with a multiple detection head and a CBAM convolutional attention module to extract features at different scales. To solve the problem of high-density object detection, a YOLOv7 network architecture combined with the Swin Transformer units is proposed, and a new pyramidal pooling module, SPPFS is incorporated into the network. Finally, we incorporate the SoftNMS and the Mish activation function to improve the network’s ability to identify overlapping and occlusion objects. Various experiments on the open-source dataset VisDrone2019 reveal that our new model brings a significant performance boost compared to other state-of-the-art (SOTA) models. Compared with the YOLOv7 object detection algorithm of the baseline network, the mAP0.5 of MS-YOLOv7 increased by 6.0%, the mAP0.95 increased by 4.9%. Ablation experiments show that the designed modules can improve detection accuracy and visually display the detection effect in different scenarios. This experiment demonstrates the applicability of the MS-YOLOv7 for UAV aerial photograph object detection.

Details

Title
MS-YOLOv7:YOLOv7 Based on Multi-Scale for Object Detection on UAV Aerial Photography
Author
Zhao, LiangLiang; Zhu, MinLing
First page
188
Publication year
2023
Publication date
2023
Publisher
MDPI AG
e-ISSN
2504446X
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2791603161
Copyright
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.