Abstract

Despite significant advances in robotic grasping driven by visual perception, deploying robots in unstructured environments to perform user-specified tasks still poses considerable challenges. Natural language offers an intuitive means of specifying task objectives and reducing ambiguity. In this study, we introduce natural language into a vision-guided grasping system by employing visual attributes as a bridge between language instructions and visual observations. We propose a command-driven semantic grasping architecture that integrates pixel attention into the visual attribute recognition module and includes a modified grasp pose estimation network to enhance prediction accuracy. Experimental results show that our approach improves the performance of the submodules, including visual attribute recognition and grasp pose estimation, compared with baseline models. Furthermore, the proposed model proves notably effective in real-world user-specified grasping experiments.
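To make the described pipeline concrete, the following is a minimal PyTorch sketch of one plausible reading of the abstract: a language instruction conditions a per-pixel attention map over visual features, and the attended features feed a pixel-wise grasp pose head. It is not the authors' implementation; all module names, dimensions, the toy instruction encoder, and the attribute vocabulary size are illustrative assumptions.

    # Hypothetical sketch of a command-driven grasping pipeline (assumed structure,
    # not the paper's actual code): language -> pixel attention -> grasp pose maps.
    import torch
    import torch.nn as nn

    class PixelAttention(nn.Module):
        """Per-pixel attention conditioned on a language/attribute embedding."""
        def __init__(self, feat_dim: int, lang_dim: int):
            super().__init__()
            self.project = nn.Conv2d(feat_dim + lang_dim, 1, kernel_size=1)

        def forward(self, feats: torch.Tensor, lang: torch.Tensor) -> torch.Tensor:
            # feats: (B, C, H, W); lang: (B, D) broadcast over the spatial grid.
            b, _, h, w = feats.shape
            lang_map = lang[:, :, None, None].expand(b, lang.shape[1], h, w)
            attn = torch.sigmoid(self.project(torch.cat([feats, lang_map], dim=1)))
            return feats * attn  # attribute-weighted visual features

    class GraspPoseHead(nn.Module):
        """Predicts grasp quality, angle (cos/sin), and gripper width per pixel."""
        def __init__(self, feat_dim: int):
            super().__init__()
            self.quality = nn.Conv2d(feat_dim, 1, kernel_size=1)
            self.angle = nn.Conv2d(feat_dim, 2, kernel_size=1)  # cos(2θ), sin(2θ)
            self.width = nn.Conv2d(feat_dim, 1, kernel_size=1)

        def forward(self, feats: torch.Tensor):
            return self.quality(feats), self.angle(feats), self.width(feats)

    class CommandDrivenGrasper(nn.Module):
        def __init__(self, feat_dim: int = 64, lang_dim: int = 32, vocab: int = 100):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            )
            self.lang_encoder = nn.EmbeddingBag(vocab, lang_dim)  # toy instruction encoder
            self.attention = PixelAttention(feat_dim, lang_dim)
            self.grasp_head = GraspPoseHead(feat_dim)

        def forward(self, image: torch.Tensor, tokens: torch.Tensor):
            feats = self.backbone(image)
            lang = self.lang_encoder(tokens)
            return self.grasp_head(self.attention(feats, lang))

    if __name__ == "__main__":
        model = CommandDrivenGrasper()
        img = torch.randn(1, 3, 224, 224)   # RGB observation
        cmd = torch.tensor([[3, 17, 42]])   # token ids for an instruction, e.g. "grasp the red mug"
        quality, angle, width = model(img, cmd)
        print(quality.shape, angle.shape, width.shape)

In this reading, the grasp is executed at the pixel with the highest predicted quality, using the corresponding angle and width; the paper's actual attribute recognition and pose estimation networks are more elaborate than this sketch.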

Details

Title
Command-driven semantic robotic grasping towards user-specified tasks
Author
Lyu, Qing 1; Ye, Qingwen 1; Chen, Xiaoyan 2; Zhang, Qiuju 1

1 Jiangnan University, School of Intelligent Manufacturing, Jiangyin, China (GRID:grid.258151.a) (ISNI:0000 0001 0708 1323); Jiangnan University, Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Wuxi, China (GRID:grid.258151.a) (ISNI:0000 0001 0708 1323)
2 Jiangnan University, Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Wuxi, China (GRID:grid.258151.a) (ISNI:0000 0001 0708 1323); Wuxi University, School of Automation, Wuxi, China (GRID:grid.258151.a)
Pages
334
Publication year
2025
Publication date
Aug 2025
Publisher
Springer Nature B.V.
ISSN
2199-4536
e-ISSN
2198-6053
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3218320848
Copyright
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”).