Abstract

Dialogue systems must understand children’s utterance intentions by accounting for their unique linguistic characteristics, such as syntactic incompleteness, pronunciation inaccuracies, and creative expressions, to enable natural conversational engagement in child–robot interactions. Even state-of-the-art large language models (LLMs), despite their strong language understanding and contextual awareness, cannot interpret children’s intent as accurately as humans because of these distinctive features. To improve intention reasoning in verbal interactions with children, an LLM-based dialogue system should learn how humans interpret children’s speech. To this end, we propose a fine-tuning methodology that utilizes LLM–human judgment discrepancy data and interactive response data. The former comprise cases in which the LLM and human judgments of the contextual appropriateness of a child’s answer to a robot’s question diverge; the latter comprise LLM-generated robot responses suited to children’s utterance intentions. Using these datasets, we developed a fine-tuned dialogue system that interprets children’s utterances in a human-like manner and responds adaptively. The system was evaluated through human assessment using the Robotic Social Attributes Scale (RoSAS) and Sensibleness and Specificity Average (SSA) metrics. The results show that it effectively interprets children’s utterance intentions and enables natural verbal interactions, even with syntactically incomplete or mispronounced utterances.
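
The article itself contains no code; as a rough illustration only, the sketch below shows one plausible way to serialize the two dataset types described in the abstract into chat-style fine-tuning records, and to compute the SSA metric as the simple average of human-rated sensibleness and specificity (following its standard definition). All function names, field names, file names, and example utterances are hypothetical assumptions, not the authors’ actual data schema.

import json
from statistics import mean

# Hypothetical sketch: build chat-style fine-tuning records (JSONL) from
# the two dataset types described in the abstract, and compute SSA.
# Every name and example below is illustrative, not the authors' schema.

def discrepancy_record(robot_question, child_answer, human_label):
    """A case where LLM and human judgments of the contextual
    appropriateness of a child's answer diverged; the human judgment
    is kept as the training target."""
    return {
        "messages": [
            {"role": "system",
             "content": "Judge whether the child's answer is contextually "
                        "appropriate for the robot's question."},
            {"role": "user",
             "content": f"Robot: {robot_question}\nChild: {child_answer}"},
            {"role": "assistant", "content": human_label},
        ]
    }

def response_record(child_utterance, robot_response):
    """An interactive-response pair: a robot reply suited to the
    child's utterance intention."""
    return {
        "messages": [
            {"role": "system",
             "content": "Respond naturally to the child's utterance."},
            {"role": "user", "content": child_utterance},
            {"role": "assistant", "content": robot_response},
        ]
    }

def ssa(sensibleness, specificity):
    """Sensibleness and Specificity Average: the mean of the two
    human-rated scores (each typically the fraction of responses
    judged sensible / specific)."""
    return mean([sensibleness, specificity])

if __name__ == "__main__":
    records = [
        # Mispronounced answer ("banana breakfast") a human still judges appropriate.
        discrepancy_record("What did you eat today?", "nana bekfus", "appropriate"),
        response_record("I dawed a wobot!", "Wow, you drew a robot? Tell me about it!"),
    ]
    with open("finetune_data.jsonl", "w", encoding="utf-8") as f:
        for r in records:
            f.write(json.dumps(r, ensure_ascii=False) + "\n")
    print(f"SSA = {ssa(0.85, 0.71):.2f}")  # e.g., SSA = 0.78
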

Details

Title
Child-Centric Robot Dialogue Systems: Fine-Tuning Large Language Models for Better Utterance Understanding and Interaction
Author
Kim, Da-Young 1; Lym, Hyo Jeong 2; Lee, Hanna 2; Lee, Ye Jun 2; Kim, Juhyun 2; Kim, Min-Gyu 2; Baek, Yunju 3

1 Human-Robot Interaction Center, Korea Institute of Robotics & Technology Convergence (KIRO), Pohang 37553, Republic of Korea, and Department of Information Convergence Engineering, Pusan National University, Busan 46241, Republic of Korea; [email protected] (D.-Y.K.)
2 Human-Robot Interaction Center, Korea Institute of Robotics & Technology Convergence (KIRO), Pohang 37553, Republic of Korea; [email protected] (H.J.L.); [email protected] (H.L.); [email protected] (Y.J.L.); [email protected] (J.K.); [email protected] (M.-G.K.)
3 Department of Information Convergence Engineering, Pusan National University, Busan 46241, Republic of Korea
First page
7939
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
1424-8220
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3149755124
Copyright
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).