
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Accurate recognition of fruit in the orchard is an important step toward robotic picking in natural environments, yet many CNN models have a low recognition rate when dealing with irregularly shaped and very dense fruits, such as grape bunches. Applying transformer architectures to image processing is a recent trend in computer vision. This paper applies the Swin Transformer and DETR models to grape bunch detection and compares them with traditional CNN models such as Faster-RCNN, SSD, and YOLO. The optimal number of stages for the Swin Transformer is selected through experiments. The latest YOLOX model is also compared with the Swin Transformer, and the experimental results show that YOLOX has higher accuracy and a better detection effect. All models are trained on red grape datasets collected under natural light, and the dataset is expanded through image data augmentation to improve training. After 200 epochs of training, SwinGD, the Swin Transformer-based detection model, achieved an mAP of 94% at IoU = 0.5. Under overexposure, overly dark lighting, and occlusion, SwinGD recognizes grape bunches more accurately and robustly than the other models, and it still performs better on dense grape bunches. Furthermore, 100 pictures of grapes containing 655 grape bunches were downloaded from Baidu pictures to test detection performance; the Swin Transformer achieved an accuracy of 91.5%. To verify the generality of SwinGD, we also tested it on green grape images. The experimental results show that SwinGD performs well in practical application. The success of SwinGD provides a new solution for precision harvesting in agriculture.
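
The abstract reports mAP at an IoU threshold of 0.5. As a rough illustration of what that threshold means, the sketch below computes the intersection over union (IoU) of two axis-aligned boxes and checks whether a predicted box would count as a match. The box layout (x_min, y_min, x_max, y_max) and the names iou and is_match are assumptions made for this example; they are not taken from the paper or from any particular detection library, and a full mAP computation involves further steps (confidence ranking, precision-recall integration) that are not shown.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so that disjoint boxes give zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: does the prediction overlap the ground truth enough?"""
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    truth = (0.0, 0.0, 4.0, 4.0)
    print(is_match((0.0, 0.0, 4.0, 3.0), truth))  # large overlap
    print(is_match((3.0, 3.0, 7.0, 7.0), truth))  # small overlap
```

Under this criterion, a predicted grape bunch box is treated as correct only when it overlaps the labeled box by at least half of their combined area.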

Details

Title
SwinGD: A Robust Grape Bunch Detection Model Based on Swin Transformer in Complex Vineyard Environment
Author
Wang, Jinhai 1; Zhang, Zongyin 1; Luo, Lufeng 2; Zhu, Wenbo 2; Chen, Jianwen 1; Wang, Wei 1

1 College of Electronic and Information Engineering, Foshan University, Foshan 528000, China; [email protected] (J.W.); [email protected] (Z.Z.); [email protected] (J.C.); [email protected] (W.W.)
2 College of Mechanical and Electrical Engineering, Foshan University, Foshan 528000, China; [email protected]
First page
492
Publication year
2021
Publication date
2021
Publisher
MDPI AG
e-ISSN
2311-7524
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2602045215