1. Introduction
Young children are naturally curious about their surroundings and like to ask questions. Child development research has found that children seek factual information when encountering knowledge gaps in their understanding of the world (Chouinard et al., 2007). Besides asking their parents at home and teachers at school, children also reach out to search engines for answers. However, previous work found that children encountered challenges with typing, spelling, selecting proper keywords, and interpreting search results when searching on the Web (Druin et al., 2009).
Rapid advances in automatic speech recognition (ASR) and natural language processing (NLP) now enable children to seek information from voice-based conversational agents (VCAs) at a younger age by removing literacy requirements such as reading and spelling. Children can issue voice commands and perform voice searches by pressing a button or saying a “wake word” (e.g., “Hey Siri” or “OK Google”) to activate these devices. When adults are unavailable or lack the relevant information to answer children’s questions, VCAs empower children to seek information independently, fostering their critical thinking and self-directed learning. By responding immediately to children’s questions, VCAs support a continuous flow of learning, encouraging curiosity and the exploration of new concepts in a conversational manner that can supplement traditional adult guidance.
Conversational voice search systems are systems that use speech recognition and synthesis to engage in dialogue with users. The primary components of VCAs use ASR and NLP to recognize human speech, understand the natural language, monitor the dialogue state, retrieve the dialogue history, and choose appropriate dialogue actions. A natural language generation component then produces the natural language output, which is converted into speech by text-to-speech synthesis. We exclude social robots from our definition of VCAs because they rely on nonverbal cues (e.g., facial expressions, physical appearance, and posture) for interaction.
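The component chain described above can be illustrated with a minimal sketch. It is not the implementation of any particular VCA: every function and class name below is hypothetical, and each stage is a trivial stand-in (for example, the "audio" is plain UTF-8 text) chosen only to show how the stages hand data to one another.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """Hypothetical dialogue state: the list of past (user, agent) turns."""
    history: list = field(default_factory=list)


def recognize_speech(audio: bytes) -> str:
    # Stand-in for ASR: assumes the "audio" is already UTF-8 text.
    return audio.decode("utf-8")


def understand(text: str) -> dict:
    # Stand-in for language understanding: question vs. statement.
    cleaned = text.strip()
    intent = "question" if cleaned.endswith("?") else "statement"
    return {"intent": intent, "text": cleaned}


def choose_action(meaning: dict, state: DialogueState) -> str:
    # Consult the dialogue history before choosing a dialogue action.
    asked_before = any(user == meaning["text"] for user, _ in state.history)
    if meaning["intent"] == "question":
        return "remind" if asked_before else "answer"
    return "acknowledge"


def generate_text(action: str, meaning: dict) -> str:
    # Stand-in for natural language generation.
    if action == "answer":
        return f"Let me look that up: {meaning['text']}"
    if action == "remind":
        return "You asked that earlier. Here is the answer again."
    return "Okay."


def synthesize(text: str) -> bytes:
    # Stand-in for text-to-speech: returns the text as bytes.
    return text.encode("utf-8")


def handle_turn(audio: bytes, state: DialogueState) -> bytes:
    # One dialogue turn: ASR -> understanding -> action -> generation -> TTS.
    text = recognize_speech(audio)
    meaning = understand(text)
    action = choose_action(meaning, state)
    reply = generate_text(action, meaning)
    state.history.append((meaning["text"], reply))
    return synthesize(reply)
```

Calling `handle_turn(b"Why is the sky blue?", DialogueState())` runs one full turn. In a real system each stand-in would be replaced by a trained model or service, but the order of the stages would follow the description above.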
A Common Sense Media report (Rideout and Robb, 2020) showed that smart speaker ownership among children under 8 rose from 9% in 2017 to 41% in 2020. In addition, a Pew Research Center survey (Auxier et al., 2020) showed that 36% of parents reported that their children under 12 used or interacted with a smart speaker. These reports indicate that the current generation of children is growing up with advanced technology, and artificial intelligence (AI)-based tools and systems have...





