© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Accurate and timely pest monitoring is essential for sustainable agriculture and effective crop protection. While recent deep learning-based pest recognition systems have significantly improved accuracy, they are typically trained for fixed label sets and narrowly defined tasks. In this paper, we present RefPestSeg, a universal, language-promptable segmentation model specifically designed for pest monitoring. RefPestSeg can segment targets at any semantic level, such as species, genus, life stage, or damage type, conditioned on flexible natural language instructions. The model adopts a symmetric architecture with self-attention and cross-attention mechanisms to tightly align visual features with language embeddings in a unified feature space. To further enhance performance in challenging field conditions, we integrate an optimized super-resolution module to improve image quality and employ diverse data augmentation strategies to enrich the training distribution. A lightweight postprocessing step refines segmentation masks by suppressing highly overlapping regions and removing noise blobs introduced by cluttered backgrounds. Extensive experiments on a challenging pest dataset show that RefPestSeg achieves an Intersection over Union (IoU) of 69.08 while maintaining robustness in real-world scenarios. By enabling language-guided pest segmentation, RefPestSeg advances toward more intelligent, adaptable monitoring systems that can respond to real-time agricultural demands without costly model retraining.
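The lightweight postprocessing the abstract describes (suppressing highly overlapping masks and removing noise blobs) can be illustrated with a minimal greedy mask-level NMS sketch. This is an illustrative assumption, not the paper's released code: the function name `refine_masks` and the thresholds are hypothetical, and the paper's actual procedure may differ.

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """IoU between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def refine_masks(masks, scores, iou_thresh=0.7, min_area=50):
    """Greedy mask-level NMS plus small-blob removal (illustrative sketch).

    masks  : list of boolean HxW arrays
    scores : per-mask confidence scores
    """
    order = np.argsort(scores)[::-1]  # highest-confidence masks first
    kept = []
    for i in order:
        m = masks[i]
        if m.sum() < min_area:        # drop tiny noise blobs from clutter
            continue
        # keep a mask only if it does not heavily overlap an accepted one
        if all(mask_iou(m, masks[j]) < iou_thresh for j in kept):
            kept.append(i)
    return [masks[i] for i in kept]
```

In practice such a step runs after the segmentation head, so duplicate predictions of the same insect collapse to one mask and background speckle is discarded before counting.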

Details

Title
Universal Image Segmentation with Arbitrary Granularity for Efficient Pest Monitoring
Author
Minh, Dang L 1; Danish, Sufyan 2; Fayaz, Muhammad 2; Khan, Asma 2; Arzu, Gul E 2; Tightiz, Lilia 2; Song, Hyoung-Kyu 3; Moon, Hyeonjoon 2

 Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam, Faculty of Information Technology, Duy Tan University, Da Nang 550000, Vietnam, Department of Information and Communication Engineering and Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea; [email protected] 
 Department of Computer Science and Engineering, Sejong University, Seoul 05006, Republic of Korea; [email protected] (S.D.); [email protected] (M.F.); [email protected] (A.K.); [email protected] (G.E.A.); [email protected] (L.T.) 
 Department of Information and Communication Engineering and Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Republic of Korea; [email protected] 
First page
1462
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
23117524
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3286299865