Introduction
With English as a global lingua franca, researchers highlight the critical roles of speaking and writing in the academic and business success of English as a foreign language (EFL) learners (Liu et al., 2016; Mahmood, 2023). Without proficiency in these areas, EFL learners may struggle to communicate effectively (Huang et al., 2024; Jao et al., 2023). In response to the role of English as a global medium of communication, content and language integrated learning (CLIL), supported by the European Commission’s Council of Education, has been widely adopted (Coyle et al., 2010). CLIL is an educational approach that teaches academic subjects in a non-native language, focusing on both language and content acquisition (Coyle, 2007). Taiwan has shifted from conventional EFL instruction to the integration of subject content and English to improve overall English proficiency as part of the nation’s emergent bilingual efforts (Liu et al., 2023; Lo, 2020). However, developing productive skills such as speaking and writing within the CLIL framework presents significant challenges, particularly because of the cognitive demands placed on learners. CLIL students face a greater cognitive load than their non-CLIL counterparts when learning to speak and write, owing to the technical and abstract nature of the conceptual knowledge involved (Tragant et al., 2016). Addressing these challenges is crucial for ensuring that students can engage effectively in academic and real-world communication.
Despite the benefits of CLIL in enhancing language proficiency and content knowledge (e.g., Bayram et al., 2019; Huang, 2020), several challenges persist, particularly in Taiwan. CLIL classes frequently emphasize receptive skills like listening and reading over productive skills such as speaking and writing, leading to an imbalance that can hinder the development of comprehensive communicative competence and leave students less prepared for real-world language use (Dallinger et al., 2016). Additionally, many of these classes are led by non-native speakers of the target language and integrated into content-specific lessons, which can present difficulties, especially when teachers lack confidence in both language proficiency and content delivery (Espinet et al., 2018; Valdés-Sánchez and Espinet, 2020). Moreover, teaching content knowledge in a foreign language can impede foreign/second language (L2) learning and comprehension of disciplinary content, particularly in demanding fields like CLIL science, where students face the dual challenge of learning abstract concepts and academic language in their less proficient L2 (Lo and Lin, 2015; 2019). These challenges can further lead to negative affectivity among younger learners, affecting their motivation and engagement (Otwinowska and Foryś, 2017). Consequently, researchers advocate for increased scaffolding and the implementation of more effective instructional strategies by teachers (Lo and Lin, 2015; 2019) to specifically support the development of speaking and writing skills in CLIL settings.
However, while multimodal input and digital tools have been proposed as solutions, the specific impact of virtual reality (VR) games on enhancing productive skills within the CLIL context remains underexplored. This study seeks to address this gap by investigating the role of VR games in facilitating the development of speaking and writing skills, essential for students’ success in both academic and real-world settings. Multimodal input involves incorporating various semiotic modes such as visuals, texts, images, symbols, and sounds to facilitate content knowledge learning and meaning creation for students (Gilabert et al., 2016; Lo and Lin, 2019; Nikula and Moore, 2019). Research indicates that multimodal input significantly enhances communication and supports L2 acquisition by engaging learners across textual, visual, aural, linguistic, spatial, and gestural dimensions (Beltran-Palanques, 2024; Liu et al., 2018). This comprehensive engagement leverages multiple senses, fostering a deeper understanding and retention of both language and content, crucial for developing productive skills. VR games, in particular, represent a promising yet insufficiently studied modality within the CLIL framework. These games are rich in multimodal input, integrating elements such as written language, visual shapes, animations, sounds, and graphs. These components not only contribute to constructing scientific knowledge but also scaffold learners’ understanding of disciplinary content and language at both discourse and linguistic levels (Fernández-Fontecha et al., 2020). Aligning with language acquisition principles such as interaction, comprehensible input, and output theory (Egbert et al., 2020), VR games provide engaging and interactive experiences that specifically enhance speaking and writing opportunities (Chen and Hsu, 2020; Lai and Chen, 2021). 
Moreover, by fostering an immersive and interactive learning environment, VR games are uniquely positioned to address the identified challenges of cognitive overload and the imbalance between receptive and productive skills in CLIL settings, which traditional methods may fail to overcome. Studies have also shown that VR games promote knowledge acquisition and retention, which are vital for mastering productive skills (Alrehaili and Al Osman, 2019; Shi et al., 2019).
Despite the potential of VR games, there remains a significant research gap concerning their role in enhancing multimodal output, particularly writing and speaking, in CLIL science education. Filling this gap is essential, as understanding the effectiveness of VR games could lead to more effective teaching strategies that align with the needs of EFL learners within CLIL environments. While traditional content assessments have predominantly relied on written language (Fernandes et al., 2017), studies comparing the impact of VR games and PowerPoint-led (PPT) games on EFL learners’ multimodal output in CLIL science education, especially in writing and speaking, are notably lacking (Rubio-López, 2024; Törmälä and Kulju, 2023). Additionally, the limited evidence on the impact of VR games on L2 learners’ speaking and writing (Parmaxi, 2023), especially when mediated through self-regulated learning (SRL), highlights the need for further research in this area. Grounded in SRL theory, this quasi-experimental study investigates the potential benefits of using VR games to enhance fourth-grade CLIL students’ productive language skills, specifically writing and speaking, by analyzing their ability to convey scientific concepts through multimodal output, including textual and graphical elements in poster designs and oral presentations. Specifically, it examines students’ English poster designs (writing) and oral presentations (speaking) using the 4Cs (Content, Communication, Cognition, and Culture) framework in multimodal assessments. The study triangulates the impact of different types of multimodal input on students’ multimodal artifact designs and poster presentations, incorporating evaluations from expert raters.
Literature review
Theoretical framework: self-regulated learning (SRL)
Self-regulated learning (SRL) is a dynamic process based on social-cognitive theory, where learners actively manage their cognitive processes, social interactions, and motivational strategies to achieve educational goals (Usher and Schunk, 2018; Zimmerman, 2000). SRL comprises three essential components: cognition, behavior, and motivation (Cho and Shen, 2013; Pintrich, 2004). Cognitively, learners engage in strategic thinking, planning, and task analysis during the forethought phase (Zimmerman, 2002). They use metacognitive strategies like elaboration for deep learning and adapt their cognitive processes through planning and goal setting (Winne, 2018). Behaviorally, self-regulated learners structure their learning environment and manage time effectively, exhibiting behaviors that enhance learning across different contexts (Zimmerman, 2011). They adopt strategies such as outlining and minimizing distractions to optimize their learning environment (Weinstein et al., 2011). Motivationally, self-regulated learners are driven by the willingness to apply cognitive and metacognitive strategies, setting and achieving higher learning goals (Wallin and Adawi, 2018). Motivation influences the learner’s engagement and persistence in learning tasks. SRL strategies interact across cognitive, metacognitive, motivational, and behavioral dimensions, supporting one another to enhance learning outcomes. Effective SRL strategies are crucial as proficient self-regulated learners show improved performance compared to less proficient learners (Cho and Shen, 2013; Schwinger and Otterpohl, 2017).
Research has demonstrated that SRL strategies significantly enhance English speaking and writing skills in EFL contexts. For instance, Xu (2021), in a mixed-method study with Chinese university students, revealed that positive orientations toward written corrective feedback (WCF) in online English writing courses enhanced the use of SRL strategies, despite limitations in peer interactions due to low writing proficiency and online communication challenges. This study suggests that incorporating SRL strategies in online learning environments could lead to improvements in writing skills, a finding that demonstrates the potential benefits of SRL in digital and virtual learning contexts. Additionally, Bai, Wang, and Zhou (2022) conducted a quantitative pretest-posttest study with 468 Hong Kong primary school students, demonstrating that a self-regulated writing strategy-based intervention supported by e-learning tools significantly improved students’ use of SRL writing strategies, though it did not significantly enhance self-efficacy or perceived ease of use. The mixed results of this study highlight the need for further exploration into the factors that influence the effectiveness of SRL interventions, particularly in relation to technology-enhanced learning environments. Al-Hawamleh et al. (2022) conducted a qualitative study with Kuwaiti female EFL students and found that SRL through digital portfolios substantially improved speaking skills by facilitating anticipation, realization, and reflection phases during speaking tasks. This study illustrates the effectiveness of digital tools in promoting SRL, which aligns with the growing interest in using VR and other digital platforms to support language learning. 
In Xu and Wang’s (2024) quantitative study involving 362 Chinese university students, academic buoyancy and emotions were found to significantly correlate with SRL writing strategies in English writing classrooms, highlighting the importance of fostering positive emotions to enhance SRL in L2 writing contexts. This study provides valuable insights into the emotional aspects of SRL, suggesting that emotional support could be a critical component of successful SRL interventions in VR-based learning environments. These studies collectively indicate that SRL strategies play a crucial role in enhancing English speaking and writing skills. They also provide a strong foundation for investigating how SRL can be leveraged within VR games to enhance multimodal learning outcomes, particularly in CLIL settings.
Enhancing English speaking and writing through game-based virtual reality learning environments
Gamification is defined as the integration of game-like elements, such as points, bonuses, and challenges, into educational contexts to motivate and engage users (Santhanam et al., 2016; Schöbel et al., 2020). When gamification is integrated into VR technologies, game-based virtual reality learning environments (VRLEs) emerge as captivating and immersive tools for enhancing language learning by embedding educational content within engaging digital games (Burguillo, 2010). The effectiveness of incorporating digital games into education hinges on adept instructional design and strategies that foster engagement, social collaboration, behavior enhancement, exploration, and motivation (Schonfeld, 2013). Game dynamics in VRLEs act as external stimuli that strategically guide and motivate desired learning behaviors through interactions between learners and key activities. These dynamics include unpredictable time pressure, meaningful social interactions, structured behavioral challenges, exploratory activities, and emotionally evocative incentives, all of which contribute to self-motivation and deeper learning engagement (Steffen et al., 2019).
Several studies demonstrate the effectiveness of VRLEs for writing and speaking, although their application within the CLIL framework, particularly for enhancing multimodal output such as writing and speaking, has received little attention. For instance, Yang et al. (2021) conducted a quantitative study with Chinese primary school students and found that using a spherical video-based virtual reality (SVVR) learning system significantly improved writing performance, thematic coherence, structural integrity, and linguistic expressiveness compared to traditional methods, although it did not significantly enhance creative thinking. This study suggests that VRLEs can effectively improve certain aspects of writing skills, though it also indicates that creative thinking may require additional support or different types of interventions within VR environments. Similarly, Chen et al. (2022) conducted a quasi-experimental study with 59 Chinese fourth-grade students and discovered that the SVVR approach significantly improved deep writing skills, linguistic expressiveness, and creative thinking. This study supports the notion that VR-based learning can enhance complex writing skills, providing a strong rationale for further exploring how VR can be used to support multimodal learning in CLIL contexts. However, the impact of VR games on speaking skills within CLIL settings remains underexplored, which this study aims to address. Additionally, Wiboolyasarin et al. (2023) carried out a pre-experimental study with 40 second-year undergraduate students across China, Japan, and Hong Kong, finding that collaborative tasks in the 3D virtual RILCA World significantly enhanced Thai-speaking proficiency, particularly in fluency, pronunciation, and language use, highlighting its potential to improve L2 speaking skills in authentic learning environments.
These findings stress the value of collaborative, immersive VR environments in enhancing speaking skills, which is particularly relevant for CLIL students who need to develop both content knowledge and language proficiency. These studies collectively highlight the diverse applications and significant benefits of VRLEs in improving both writing and speaking skills and suggest that VRLEs can create immersive and interactive learning environments that traditional methods often fail to provide. This research, therefore, builds on the existing evidence by specifically investigating the impact of VRLEs on multimodal output in CLIL settings, addressing a critical gap in the literature and providing actionable insights for educators seeking to enhance productive skills through innovative digital tools.
Multimodal assessments
The increasing focus on multimodal assessment reflects a deeper acknowledgment of students’ diverse ways of understanding and meaning-making across different modes of communication. This recognition has spurred interest in developing comprehensive assessment practices that align with students’ multimodal productions (Beltran-Palanques, 2024). Multimodal assessment involves the use of multiple semiotic modes, such as text, visuals, audio, and gestures, to evaluate students’ knowledge and skills, offering a more holistic view of their learning (Campoy and Querol-Julián, 2021; Querol-Julián and Beltrán-Palanques, 2021). Each modality has specific affordances that shape how meaning is constructed, and the combination of modalities can enhance the overall communicative potential, creating a richer and more accurate representation of student learning (Kress, 2000; Lemke, 2002). In the context of CLIL, multimodal assessments must encompass the 4Cs framework (Content, Communication, Cognition, and Culture), ensuring that assessments capture content mastery, communicative competence, cognitive engagement, and cultural awareness (Coyle, 2007). This approach not only supports diverse learner needs, particularly those of English learners (ELs), but also leverages the unique affordances of different modalities to enhance the accuracy and completeness of content learning evaluations (Grapin and Llosa, 2022). For instance, visual modalities can effectively communicate spatial relationships, while oral modalities can capture nuances of tone and emphasis, thus providing a more comprehensive understanding of students’ learning processes and outcomes (Grapin, 2019). By integrating multiple modalities, educators can better diagnose learning challenges and provide targeted feedback, ultimately fostering a more inclusive and effective educational environment (Heritage et al., 2015).
Previous research has demonstrated the effectiveness of multimodal assessment in providing a comprehensive evaluation of students’ content learning, particularly for ELs. For instance, Grapin (2020) conducted a mixed-methods study with 393 fifth-grade students to explore whether multimodal assessment tasks provided additional insights compared to traditional written assessments. The study revealed that multimodal assessments offered more comprehensive information about students’ science understanding, benefiting ELs significantly. This study illustrates the importance of using multiple modalities in assessment to capture a fuller picture of students’ understanding, particularly for language learners who may struggle with traditional assessments. Similarly, Grapin and Llosa (2022) examined the performance of fifth-grade students on science tasks using visual, written, and oral modalities. They found that triangulating visual and written responses with oral responses yielded more accurate interpretations of students’ science understanding, emphasizing the potential of multimodal assessments for a holistic evaluation of content knowledge. These findings suggest that multimodal assessments can enhance the validity and reliability of content assessments, which is crucial in CLIL contexts where students must demonstrate understanding across multiple dimensions. Moreover, Rubio-López (2024) explored the integration of multimodal communication using game-based learning with authentic materials in a mixed-method study. The study found that this approach significantly enhanced Spanish secondary students’ multimodal communicative competence, critical thinking, and digital literacy skills in the EFL classroom. This study highlights the potential for game-based learning environments to not only improve language skills but also to foster higher-order thinking skills, which are essential for success in academic and professional settings. 
These studies collectively highlight the strengths of multimodal assessments in capturing diverse aspects of students’ content learning and communicative competence. Building on this body of research, the present study seeks to explore how multimodal assessments within VR-based review games can provide deeper insights into students’ learning processes and outcomes, particularly in a CLIL science context.
While multimodal assessments have gained attention for their potential to capture a broader range of student skills and knowledge, there remains a gap in empirical studies examining their effectiveness within game-based VRLEs, particularly in CLIL contexts using the 4Cs framework. This study addresses this gap by investigating how different types of multimodal input provided through VR-based review games influence students’ multimodal output. By comparing these outcomes to those achieved through traditional PowerPoint (PPT)-led review sessions, this research aims to provide evidence-based insights into the unique affordances of VR in fostering comprehensive communicative competence in CLIL environments. Specifically, it seeks to empirically compare the impact of multimodal input with oral, visual, and written modalities in VR-based review games vs. PPT-led review games on fourth-grade students’ English multimodal output in English poster designs (writing) and oral presentations (speaking) within a CLIL science context in Taiwan. The study is guided by the following research questions:
RQ 1: Does participation in VR review games, compared to traditional PPT-led review sessions, enhance the textual and graphical composition of English posters created by fourth-grade students in a CLIL science setting, particularly in terms of Content, Communication, Cognition, and Culture, as evaluated through the 4Cs framework?
RQ 2: How do VR-based review games influence the quality of English poster presentations, compared to traditional PPT-led review sessions, particularly in terms of Content, Communication, Cognition, and Culture, as evaluated through the 4Cs framework?
RQ 3: How do expert raters perceive the influence of VR-based vs. PowerPoint-led review games on the multimodal output (textual, graphical, and oral) in poster creation and presentations by fourth-grade CLIL students?
Methodology
Research design and participants
This quasi-experimental study utilized a comparative case study design (Ivankova et al., 2006), focusing on four intact fourth-grade CLIL science classes in three Taiwanese public elementary schools. Fourth-grade students were selected because they are at a critical developmental stage where both language acquisition and cognitive skills are rapidly evolving, making them ideal candidates for investigating the impact of VR games on language and content learning. Additionally, these fourth-grade students in Taiwan have received 3.5 years of English instruction, providing a consistent foundation in both language proficiency and content knowledge, which is essential for a fair comparison between the experimental and control groups.
Two classes were randomly assigned as the experimental group (EG), using VR-based games for content review, and the other two as the control group (CG), using PPT presentations. The EG consisted of 40 students (EG1: 22; EG2: 18), and the CG consisted of 41 students (CG1: 22; CG2: 19). All participants were native Mandarin Chinese speakers aged 9 to 10 years old, enrolled in a CLIL-based English immersion program for 3.5 years, demonstrating comparable listening and speaking skills based on previous course assessments (t = 1.895, p = 0.092). Pre-intervention t-tests for science vocabulary and content knowledge showed no significant differences (t = 1.035, p = 0.304 for vocabulary; t = 1.761, p = 0.082 for content knowledge). The weekly two-period CLIL science curriculum targeted literacy enhancement by exploring topics such as life science and physical science in English, with instruction conducted entirely in English. The standardized instructional process for the four classes included topic introduction and presentation in the first period, followed by a 25-min review session using either VR games for the EG or PPT-led games for the CG in the second period across five consecutive science units. Identical instructional materials ensured a consistent learning experience and facilitated comparative analysis.
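As an illustration of the baseline-equivalence checks reported above, the following minimal sketch computes a two-sample t statistic and its degrees of freedom. It assumes Welch's unequal-variance form, which the text does not specify, and the scores are invented for illustration rather than drawn from the study data. The function name `welch_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical pretest scores, invented for illustration only.
eg_scores = [1, 2, 3, 4, 5]
cg_scores = [2, 3, 4, 5, 6]
t_stat, dof = welch_t(eg_scores, cg_scores)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The p value would then be obtained from the t distribution with the computed degrees of freedom, for example with a statistics package.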
The relatively small sample and specific context (fourth-grade students in Taiwanese CLIL science classes) were chosen deliberately to provide a focused examination of VR games’ impact in a controlled environment. A power analysis was conducted to confirm that the study was adequately powered to detect medium to large effect sizes (Cohen’s d = 0.5). The analysis indicated that a sample size of approximately 34 students per group would be sufficient to detect significant differences with 80% power at an alpha level of 0.05. The current sample of 40 students in the EG and 41 in the CG is therefore considered sufficient to provide reliable results within the study’s scope. However, the results should be interpreted with caution regarding their generalizability to other contexts or age groups.
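For readers who wish to run a comparable a priori calculation, the sketch below estimates the per-group sample size for a two-group comparison. It assumes a two-sided test and the standard normal approximation n = 2((z_{1-α/2} + z_{power}) / d)^2. The text does not state the test family, sidedness, or software behind the reported figure, so values computed this way need not reproduce it. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Uses the normal approximation; exact t-based calculations give
    slightly larger values.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z.inv_cdf(power)  # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# The required n falls quickly as the assumed effect size grows.
for effect_size in (0.5, 0.8):
    print(effect_size, n_per_group(effect_size))
```

Because the required sample size scales with 1/d^2, the result is highly sensitive to the assumed effect size.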
The intervention
PPT review games: gamification and multimodal input
Five units of PPT review games were designed with gamification principles, integrating elements such as points, bonuses, and challenges to motivate and engage students (Santhanam et al., 2016; Schöbel et al., 2020). The games were conducted after the topic introduction and presentation, with each session lasting 20 to 25 min. Each game consisted of 20–25 interactive 5W questions (who, what, when, where, why) that covered core scientific concepts and vocabulary. Students earned points by correctly answering questions, which often involved selecting one of three boxes or icons with hidden points (see Fig. 1), adding an element of unpredictability and excitement. Incorrect answers received immediate feedback, helping students learn from their mistakes. The teacher-led structure ensured that each student participated, with questions posed one by one. This structured yet dynamic format aimed to maintain student engagement through diverse question types. By incorporating these interactive review games, instructors aimed to create an engaging learning environment that reinforced vocabulary and content knowledge through repetition and active participation (Schmitt, 2008; Webb, 2019).
Fig. 1
Overview of the PowerPoint (PPT) review game design.
This figure provides an overview of the interactive question format used in the PPT review game, featuring the “5W” questions (Who, What, When, Where, Why) to reinforce core scientific concepts and vocabulary. Students earned points by correctly answering questions, with options hidden under three selectable boxes to introduce an element of unpredictability and excitement. The scoring system assigned points randomly to maintain engagement and motivation, while incorrect answers triggered immediate feedback to facilitate learning. The game integrated gamification elements such as points, bonuses, and challenges to keep students motivated throughout the session. Additionally, the game included information slides that presented English words alongside their definitions, Chinese translations, and relevant images to support comprehension and retention. The PPT review game served as a control intervention, contrasting with the VR-based review games to evaluate the effectiveness of different multimodal inputs on students’ learning outcomes.
The design of PPT games incorporated multimodal input to cater to different learning preferences and enhance content retention. Key elements included:
(1) Vocabulary Presentation: Each game featured the English word, its definition in English, and the Chinese translation, supported by relevant images. This multimodal approach helped students associate words with their meanings and visual representations (Gilabert et al., 2016; Liu et al., 2018).
(2) Core Content and Diagrams: The games included core content from the textbook and PPT slides, such as explanations of the water cycle and states of matter. Diagrams were used extensively for labeling activities, which reinforced scientific concepts visually (Beltran-Palanques, 2024).
(3) Varied Question Types: To maintain student interest and cater to different learning abilities, the games included a mix of question types. Some questions required reading and defining terms, while others focused on visual comprehension and diagram labeling (Nikula and Moore, 2019).
(4) Interactive Elements: Each game incorporated interactive elements, such as animated GIFs, to illustrate processes like evaporation and condensation. These visuals provided clear examples of scientific concepts and kept students engaged (Fernández-Fontecha et al., 2020).
(5) Repetition and Integration: Repetition was a key strategy, with core diagrams and vocabulary appearing in multiple questions to reinforce learning. The games also linked new content to previous units, helping students build on their existing knowledge (Lo and Lin, 2019).
(6) Humor and Engagement: Humorous images and fast-moving animations were used to capture students’ attention and make learning enjoyable. Each game began with an explanation of its relevance to the unit content, and students were encouraged to keep their science books and worksheets handy for reference (Chen and Hsu, 2020; Lai and Chen, 2021).
(7) Scoring System: The points system was designed to be flexible, allowing teachers to adapt it to their own reward systems. Points were assigned randomly to add excitement and motivation (Alrehaili and Al Osman, 2019; Shi et al., 2019).
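The scoring mechanic described in this list (a correct answer unlocks a choice among three boxes with randomly assigned hidden points, while an incorrect answer triggers immediate feedback) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the point values, the function name `play_round`, and the feedback wording are all assumptions.

```python
import random


def play_round(answer, correct, box_choice, rng, point_values=(10, 20, 50)):
    """Score one review question.

    A correct answer lets the student open one of three boxes whose
    hidden points are shuffled at random; an incorrect answer earns
    no points and returns immediate corrective feedback.
    """
    if answer != correct:
        return 0, f"Not quite. The answer is '{correct}'."
    hidden = list(point_values)  # hypothetical point values
    rng.shuffle(hidden)  # random assignment of points to boxes
    return hidden[box_choice], "Correct!"


rng = random.Random(2024)  # seeded only to make the demo repeatable
points, feedback = play_round("evaporation", "evaporation", box_choice=1, rng=rng)
print(points, feedback)
```

Passing the random generator in as a parameter keeps the point assignment unpredictable for students while allowing a teacher or tester to fix the seed.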
VR review games: gamification and multimodal input
The identical questions used in the PPT games for vocabulary reinforcement and content review across various units were adapted into VR games using CoSpaces Edu Pro. Each VR game session lasted 10–15 min, and when played in pairs, the total gameplay time was 20–25 min, matching the gameplay duration of the CG. Gamification elements included points, bonuses, levels, and challenges to motivate students (Santhanam et al., 2016; Schöbel et al., 2020). The process of designing VR content involved transforming existing game elements into 3D scenes, integrating interactive features, and optimizing the content for a VR experience. These games featured user-friendly interfaces, embedded educational content, and interactive gameplay that adhered to key language acquisition principles such as interaction, comprehensible input, and output theory (Egbert et al., 2020). The games aimed to facilitate language learning through meaningful interaction and exposure to comprehensible language input while encouraging learners to produce language output. This approach ensured that the games were not only educational but also engaging and immersive, thereby enhancing students’ motivation and participation.
The VR games incorporated various multimodal inputs to cater to different learning preferences and enhance content retention. Key elements included:
(1) Comprehensive Training: Before engaging with the VR games, participants received comprehensive training on how to properly wear and operate the Meta Quest 2 VR headsets. The buddy system (see item 4) further supported this training, ensuring that students were comfortable with the technology and could focus on the educational content (Fernández-Fontecha et al., 2020).
(2) Graphic and Textual Information: Each VR game began with clear instructions and incorporated graphic and textual information, referred to as initial information boards (see Fig. 2). Students needed to read these information boards aloud to their buddies. These information boards served as scaffolding, offering mediating texts and artifacts to support learning (Vygotsky, 1978; Lin et al., 2023). This approach aligns with Vygotsky’s Zone of Proximal Development, emphasizing the importance of targeted support structures within the immersive VR landscape.
Fig. 2
Initial information board in VR-based review game.
This figure provides an overview of the initial information board presented at the beginning of the VR game, which combines graphic and textual information to introduce core scientific concepts. It includes detailed textual content offering definitions and explanations relevant to the unit topic, designed to scaffold students’ understanding through mediating texts. Graphical elements such as diagrams and images visually represent scientific phenomena to complement the text and enhance multimodal learning. Additionally, instructions are provided for students to read the information aloud to their partners, promoting collaborative learning and reinforcing comprehension. These initial boards were strategically designed to provide foundational knowledge and context before students engaged in interactive gameplay, aligning with the educational objectives of the VR-based review sessions.
(3) Interactive Features: The main VR games began with an avatar delivering a brief audio explanation of the target content knowledge to reinforce students’ conceptual understanding. During the main games, students would also encounter 5 to 10 information boards providing content knowledge texts (see Fig. 3), which they could access without needing to read aloud. These interactive features allowed students to engage with animals or characters within the VR environments to access additional vocabulary or content knowledge texts. These elements made the learning process dynamic and interactive, facilitating better retention of information (Chen and Hsu, 2020; Lai and Chen, 2021).
Fig. 3. Information board in the main game.
This figure presents the information board used during the main phase of the VR game, offering essential content knowledge and interactive elements to enhance learning. The board features textual sections with detailed descriptions and definitions of key scientific concepts to reinforce students’ understanding. Graphical elements, such as diagrams and images, illustrate scientific processes to support the text and promote multimodal learning. Interactive features, including clickable areas and prompts, engage students by providing additional information and questions, encouraging deeper exploration of the concepts. These information boards were strategically placed throughout the game to facilitate interactive learning experiences and support students in applying and expanding their scientific knowledge during gameplay.
(4) Buddy System: Students worked in pairs (buddies) to assist each other with equipment adjustment and safe navigation within the VR environments, and to read aloud the initial information boards as well as the answers they chose for each question during the main games (see Fig. 4). This peer support mechanism enhanced collaborative learning and facilitated comprehension and engagement (Lin et al., 2023).
Fig. 4. The buddy system in VR review games.
This figure depicts the buddy system used in the VR review game, where two students collaborate—one navigating the VR environment with a headset and controllers, while the other reads questions aloud and records answers. The buddy system fosters collaboration by encouraging students to assist each other in managing equipment, navigating the virtual environment safely, and discussing their responses. This peer-assisted interaction promotes comprehension through shared decision-making and verbal communication, ensuring smooth gameplay. Written consent to reproduce the images of children in this study was obtained from their parents or legal guardians.
(5) Sequential Navigation and Feedback Mechanisms: Students navigated through the VR games sequentially within specified time limits. Feedback mechanisms provided responses for both correct and incorrect answers, helping students learn from their mistakes and reinforcing correct information (Egbert et al., 2020).
(6) Integrated Content: The content used in the VR games was the same as that in the PPT games, ensuring consistency in the educational material. Questions covered vocabulary, meaning, definition, Chinese words, application of knowledge, and pictures. This integration ensured fairness and comprehensive coverage of the relevant content (Lo and Lin, 2019).
(7) Engaging Environments: The VR games featured various environments that matched the unit topics. For example, levels focused on pollution used environments like homes, parks, and waterfalls. These settings made the learning experience relatable and engaging for students, helping them to better understand and retain the content (Fernández-Fontecha et al., 2020).
(8) Narrative and Objectives: The games included a narrative element with characters like the Racoon King, who appeared in boss levels to add a mini-narrative throughout the game. This narrative made students feel more invested in the game and motivated them to complete tasks and achieve objectives. Bonus objectives, such as picking up trash, were also included to encourage exploration and reinforce the educational content (Alrehaili and Al Osman, 2019; Shi et al., 2019).
Multimodal assessments and analysis
Poster designs and analysis
In this study, groups of 3–5 students from both the EG (n = 40) and CG (n = 41) were tasked with creating posters reflecting the content knowledge they acquired over five units of instruction. The EG and the CG each produced 10 posters, resulting in 20 posters in total, with 4 posters for each unit. These posters served as multimodal artifacts that combined visual and textual elements to convey scientific information. All students were provided with detailed guidelines (Appendix A) on content inclusion, design methods, and reflection prompts, based on Neuman and Danielson (2021). Each group focused on a different unit, distributed across four intact classes, and addressed at least three key questions from their respective units. To enhance students’ poster design skills, the instructors delivered a uniform 20-min training session using PPT presentations, as students reported a lack of prior experience with poster design in English. To ensure fairness and objectivity, students were instructed to rely solely on their classroom learning for the poster content, without referring to textbooks, worksheets, or external resources.
The evaluation of the multimodal output (posters) was guided by the 4Cs framework (Content, Communication, Cognition, and Culture) within the CLIL framework. A specially developed rubric (Appendix B) using a six-point scale was used to assess the posters. This rubric focused on the effectiveness of the posters’ textual and graphical elements (Content), the use of targeted vocabulary, sentence length, sentence complexity, and grammar accuracy (Communication), the demonstration of understanding and critical thinking regarding the scientific content (Cognition), and the incorporation of culturally relevant ideas (Culture). According to Tedick and Wesely (2015), assessing syntactic complexity is crucial for evaluating students’ language production beyond basic vocabulary, as it includes the examination of grammatical structures and sentence variety essential for expressive proficiency. The rubrics were developed and validated by two CLIL experts familiar with the Taiwanese educational context. Poster assessments were carried out by two of the three instructors in the study using these rubrics. While assessing the posters, raters compared four posters (2 from the EG and 2 from the CG) on the same unit to ensure consistent comparison. Each student in a group contributed to the poster design; therefore, each student in the same group shared the same score for the poster.
Interrater reliability (kappa) was perfect (1.0) for most categories in both the EG and CG, indicating strong agreement between raters. The exceptions were “CG Communication-sentence complexity” and “CG Culture,” for which the kappa coefficients were 0.859 and 0.853, respectively. Because both groups had more than 30 participants, the sampling distributions of the group means were assumed to be approximately normal under the Central Limit Theorem, justifying the use of independent-samples t-tests. Independent-samples t-tests conducted with SPSS Statistics (version 25) therefore compared the groups quantitatively, reporting means, standard deviations (SD), t-scores, p-values, and effect sizes (Cohen’s d) to detail the post-intervention differences between the groups. All tests were two-tailed, with the significance level (alpha) set at 0.05.
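To make the interrater reliability statistic concrete, the following is a minimal sketch of how a kappa coefficient for two raters can be computed. It assumes the unweighted form of Cohen's kappa; the section does not state which variant was used, and the analysis itself was run in SPSS. The function name `cohen_kappa` and the rating lists are hypothetical illustrations on the 0–5 rubric scale, not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    Assumption: the unweighted form is used here for illustration only.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    n = len(rater_a)

    # Observed agreement: share of items that received identical scores.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions,
    # summed over every score category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical scores for 10 posters (NOT the study's data).
rater_1 = [3, 4, 2, 5, 1, 3, 4, 2, 0, 3]
rater_2 = [3, 4, 2, 5, 1, 3, 4, 2, 0, 4]  # differs on the last poster only

kappa_partial = cohen_kappa(rater_1, rater_2)
kappa_perfect = cohen_kappa(rater_1, rater_1)
print(f"one disagreement: {kappa_partial:.3f}; full agreement: {kappa_perfect:.3f}")
```

Because kappa corrects for chance agreement, a single disagreement in ten ratings already lowers the coefficient noticeably below 1.0, which is in line with the sub-1.0 values reported for a few categories.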
Oral poster presentations and analysis
After the collaborative poster design activity, the same groups of students from both the EG (n = 40) and CG (n = 41) engaged in small-group poster presentations two weeks later. The EG and the CG each gave 10 presentations, resulting in 20 presentations overall, with 4 presentations for each unit. Prior to these presentations, students participated in a 20-min training session conducted by their respective instructors to enhance their presentation skills (Wells, 1999). The session used a PPT presentation to improve students’ skills in English presentation structure, delivery, and language proficiency. Detailed guidelines (Appendix C) were provided to assist students in effectively presenting their posters. Students rehearsed their explanations once before the actual presentations, which were recorded for analysis. Each group presentation lasted about 2–4 min.
The analysis of recorded poster presentations employed a quantitative assessment using a rubric (Appendix D) developed by the same CLIL experts. This rubric utilized a six-point scale with detailed descriptors across four dimensions: (1) Content: Focus, depth, accuracy, and practical relevance of scientific content. (2) Communication: Use of scientific vocabulary, sentence structure, grammar, pronunciation, and overall fluency. (3) Cognition: Depth of understanding, critical thinking, problem-solving skills, and ability to apply scientific concepts. (4) Culture: Integration of cultural perspectives into the scientific content. The rubrics were used by the same two instructors to assess the presentations. While assessing the presentations, raters compared four presentations (2 from the EG and 2 from the CG) on the same unit to ensure consistent comparison. Each student in a group contributed to the presentation; therefore, each student in the same group shared the same score. Most categories showed perfect interrater reliability (1.0) between the EG and CG, indicating strong agreement between the raters. However, for the category of “CG Cognition,” the interrater reliability coefficient (kappa) was 0.853, suggesting slightly lower but still substantial agreement. Following the assessment, independent-samples t-tests were conducted using SPSS Statistics (version 25) to compare post-intervention results between the EG and CG. This analysis reported means, standard deviations (SD), t-scores, p-values, and effect sizes (Cohen’s d) to quantify the differences observed between the groups. The statistical analyses were conducted with a significance level (alpha) set at 0.05, and all tests were two-tailed.
Rater reflections and analysis
After assessing the poster designs and presentations, the two instructors participated in a reflective session to capture their expert insights. The rater reflections aimed to complement the quantitative data by providing detailed interpretations of how specific functionalities and experiences from both VR and PPT review games influenced students’ work. The raters individually documented their observations, focusing on specific aspects affected by the review activities. Guiding questions prompted reflections on how the review activities impacted poster designs and presentations. These written reflections provided detailed insights and interpretations of the connections between review experiences and poster outcomes. Thematic analysis was employed to analyze the reflective data, identifying recurring patterns, emerging themes, and nuanced observations regarding the influence of VR-based and PPT-led experiences on poster design and presentation quality.
The intervention procedure
Before the 11-week study, the three instructors voluntarily participated in a briefing on the study’s purpose, objectives, instructional methods, and ethical considerations. The CG continued using teacher-led PowerPoint-based review sessions, while the EG transitioned to VR-based review sessions designed using the same content. The four classes were randomly assigned to either the EG or CG, and students were briefed on the study’s purposes. Parental consent was obtained to ensure ethical compliance. From Weeks 2–6, the first five units were taught by their respective teachers. During Weeks 8–9, students from both groups spent two weeks designing posters in small groups. In Week 10, students presented their posters. In Week 11, the raters assessed all student work and conducted Rater Reflections to provide qualitative insights into the impact of the review activities on the poster designs and presentations.
Results
Impact of VR and PPT review games on English poster composition (RQ1)
The analysis of poster designs using the 4Cs framework revealed significant differences between the EG and CG. Independent-samples t-tests were conducted to compare the mean scores for various criteria assessed through a six-point scale rubric by two raters (Table 1). The EG posters scored significantly higher in Content (M = 3.60, SD = 1.046) compared to the CG posters (M = 2.70, SD = 1.302), suggesting that the EG students demonstrated greater depth, accuracy, and relevance in their scientific content. The moderate effect size (Cohen’s d = 0.76) indicates a meaningful difference between the groups, suggesting that the multimodal input provided by the VR games may have facilitated a more thorough understanding and application of scientific concepts. While there was no significant difference in Communication categories such as target vocabulary usage and sentence complexity, the EG group exhibited significantly higher cognitive processing in their poster designs. The EG had a significantly higher mean score in Cognition (M = 3.70, SD = 0.923) compared to the CG (M = 2.40, SD = 1.142), with a large effect size (Cohen’s d = 1.25), indicating a substantial impact of VR games on enhancing students’ critical thinking and application of concepts. For Culture, although the EG posters had a higher mean score (M = 1.80, SD = 1.105) compared to the CG posters (M = 1.55, SD = 1.099), this difference was not statistically significant (t = 0.717, p = 0.478, Cohen’s d = 0.23), suggesting similar levels of cultural integration in both groups.
Table 1. Comparisons of EG and CG poster designs using 4Cs rubrics (independent-samples t-tests, EG vs. CG; EG: n = 40, 10 groups; CG: n = 41, 10 groups).
Criterion | Mean (EG) | Mean (CG) | SD (EG) | SD (CG) | t | p | Effect size (Cohen’s d) |
---|---|---|---|---|---|---|---|
Content | 3.60 | 2.70 | 1.046 | 1.302 | 2.410 | 0.021* | 0.76 |
Communication—target vocabulary usage | 4.40 | 4.10 | 0.821 | 1.252 | 0.896 | 0.376 | 0.28 |
Communication—sentence length | 2.60 | 2.20 | 1.536 | 1.361 | 0.872 | 0.389 | 0.28 |
Communication—sentence complexity | 1.50 | 1.65 | 0.688 | 0.933 | −0.578 | 0.566 | −0.18 |
Communication—grammar accuracy | 2.00 | 1.60 | 0.795 | 0.821 | 1.566 | 0.126 | 0.49 |
Cognition | 3.70 | 2.40 | 0.923 | 1.142 | 3.958 | <0.001* | 1.25 |
Culture | 1.80 | 1.55 | 1.105 | 1.099 | 0.717 | 0.478 | 0.23 |
Significance of p < 0.05 is indicated by an asterisk (*); the rubrics were used by two raters based on a six-point scale (0–5).
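As a rough illustration of how the t and Cohen's d columns in Table 1 relate to the reported means and SDs, the sketch below recomputes both from summary statistics. The helper `t_and_d_from_summary` is hypothetical, and `n_per_group = 20` is an assumption: the section does not state how many scores entered each test, and 20 per condition is simply the value under which the Content row's reported t (2.410) and d (0.76) are reproduced. The original analysis was run in SPSS.

```python
import math


def t_and_d_from_summary(mean_1, sd_1, mean_2, sd_2, n_per_group):
    """Student's t and Cohen's d for two independent groups of equal size.

    With equal group sizes, the pooled SD is the root mean of the two
    variances, and t = d * sqrt(n / 2).
    """
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    cohens_d = (mean_1 - mean_2) / pooled_sd
    t_value = cohens_d * math.sqrt(n_per_group / 2)
    return t_value, cohens_d


# Means and SDs taken from Table 1; n_per_group = 20 is an ASSUMPTION.
content_t, content_d = t_and_d_from_summary(3.60, 1.046, 2.70, 1.302, n_per_group=20)
cognition_t, cognition_d = t_and_d_from_summary(3.70, 0.923, 2.40, 1.142, n_per_group=20)

print(f"Content:   t = {content_t:.3f}, d = {content_d:.2f}")
print(f"Cognition: t = {cognition_t:.3f}, d = {cognition_d:.2f}")
```

Small discrepancies in the third decimal place of t are expected, since the inputs are means and SDs that were already rounded for the table.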
Raters’ reflections complementing quantitative poster composition results
Content: Rater reflections revealed that the content design features of EG posters were significantly influenced by VR games, resulting in a balanced integration of graphical elements and textual information to showcase their understanding and knowledge. For example, in EG 2’s Unit 11 poster, students included animals and plants from the VR game, showing how these elements interact with the environment and creatively integrating features like multiple suns seen in the VR game (see Fig. 5). EG 1’s Unit 14 poster borrowed its entire graphic design from the VR game’s initial information board, featuring elements like a water roller coaster and water droplets. These examples highlight how VR experiences directly influenced the graphical content and design choices of the EG posters, reflecting a meaningful integration of virtual learning into their poster representations. Conversely, CG posters focused more on textual explanations, resulting in less visually cohesive presentations that might affect accessibility and engagement. CG posters also featured original graphics reflecting individual interpretations, which differed from the VR-influenced designs of the EG groups.
Fig. 5. Correspondence between VR game elements and student poster.
This figure showcases the relationship between the VR game elements and a student’s poster. It highlights various animals, plants, and environmental features encountered during gameplay and how these were incorporated into EG 2’s Unit 11 poster to represent their needs and interactions. Students creatively integrated visual elements such as multiple suns from the VR game into their poster designs, demonstrating their ability to transfer conceptual knowledge from the VR experience into multimodal output. The figure also emphasizes the specific correspondences between VR elements and the poster content, illustrating how the immersive experience enriched the students’ understanding and representation of scientific concepts.
Cognition: The cognitive processing in EG poster designs consistently demonstrated higher-order cognitive skills and effective cross-unit application, showcasing robust engagement and understanding. For instance, EG 1 consistently demonstrated higher-order cognitive skills of Bloom’s taxonomy, such as evaluation and application, evident in their use of Maslow’s hierarchy of needs (see Fig. 6) and detailed depictions of water sources and pollution impacts (Units 13 and 15). Similarly, EG 2 displayed strong analytical skills, particularly in Unit 14, where they presented a comprehensive water-cycle representation using clear scientific terminology. Additionally, the EG posters also revealed an integration of content learned across different units, demonstrating an ability to connect and apply cross-unit knowledge. For example, EG 1’s Unit 13 poster highlighted various water sources and their uses, drawing on concepts from multiple units to create a comprehensive visualization. Conversely, CG poster designs generally exhibited cognitive processing aligned with lower levels of Bloom’s taxonomy, focusing primarily on recall and basic understanding. This trend was consistent across various units, where CG groups relied on straightforward visual representations, emphasizing recall and comprehension over higher-order thinking skills.
Fig. 6. EG poster engaging in higher-order cognitive processing.
This figure presents an example of EG 1’s poster demonstrating higher-order cognitive skills as outlined in Bloom’s taxonomy. The poster integrates Maslow’s hierarchy of needs to evaluate and apply scientific concepts, reflecting the students’ ability to analyze and synthesize information critically. The cognitive processes of evaluation and application are showcased, illustrating the students' capacity for deeper engagement with scientific content. The highlighted elements in the poster further reflect the complexity of student understanding, demonstrating how theoretical frameworks were effectively integrated into their designs within the CLIL context.
Culture: The cultural design features of both EG and CG posters reflected a blend of personal and societal influences. EG posters prominently integrated cultural symbols like Taiwanese icons and local references (e.g., Taiwanese noodles, flags), adding depth and connecting academic content to real-world contexts. Similarly, CG posters occasionally included cultural references, reflecting individual or cultural interpretations. However, the overall impact on cultural integration was comparable between the groups, suggesting that the review methods did not significantly influence cultural expression.
Impact of VR/PPT review games on English poster presentations (RQ2)
The analysis of poster presentations (Table 2) mirrored the findings from the poster composition. The EG outperformed the CG in terms of Content (M = 3.60, SD = 0.681 vs. M = 3.10, SD = 0.718), indicating that the EG presentations demonstrated greater scientific depth and clarity. The significant difference in Communication, particularly in target vocabulary usage and sentence complexity, suggests that the VR games helped students use more precise and varied language when explaining scientific concepts. In terms of Cognition, the EG presentations demonstrated a significantly higher mean score (M = 3.70, SD = 1.129) compared to the CG presentations (M = 2.35, SD = 0.875), t = 4.228, p < 0.001, with a large effect size (Cohen’s d = 1.34). This indicates a deeper understanding, critical thinking, and application of scientific concepts among EG students. For Culture, there was no significant difference between the groups, with EG (M = 1.30, SD = 1.302) and CG (M = 1.20, SD = 0.410), t = 0.328, p = 0.745, Cohen’s d = 0.10. This suggests similar levels of cultural integration in both groups.
Table 2. Comparisons of poster presentations using 4Cs rubrics (independent-samples t-tests, EG vs. CG; EG: n = 40, 10 groups; CG: n = 41, 10 groups).
Criterion | Mean (EG) | Mean (CG) | SD (EG) | SD (CG) | t | p | Effect size (Cohen’s d) |
---|---|---|---|---|---|---|---|
Content | 3.60 | 3.10 | 0.681 | 0.718 | 2.260 | 0.030* | 0.71 |
Communication—target vocabulary usage | 4.20 | 3.20 | 1.005 | 0.768 | 3.536 | 0.001* | 1.12 |
Communication—sentence length | 1.90 | 1.40 | 1.071 | 0.503 | 1.890 | 0.066 | 0.60 |
Communication—sentence complexity | 1.20 | 1.00 | 0.410 | 0.000 | 2.179 | 0.036* | 0.69 |
Communication—grammar accuracy | 1.40 | 1.20 | 0.681 | 0.410 | 1.125 | 0.267 | 0.36 |
Cognition | 3.70 | 2.35 | 1.129 | 0.875 | 4.228 | <0.001* | 1.34 |
Culture | 1.30 | 1.20 | 1.302 | 0.410 | 0.328 | 0.745 | 0.10 |
Significance of p < 0.05 is indicated by an asterisk (*); the rubrics were used by two raters based on a six-point scale (0–5).
Table 3. Raters’ reflections on content depth and language clarity.
Group | EG (VR) Groups | CG (PPT) Groups |
---|---|---|
Raters’ Reflection | Clearer and more comprehensible language | Less comprehensible language |
Examples of Presentation Excerpts | EG2-Unit 12 S1: “There is a lot of air in atmosphere. First is nitrogen, then second is oxygen. You know what is the wind and you know what about the atmosphere.” [Clearer and deeper explanation of gas distribution] EG1-Unit 13 S1: “We have water and land but water is more than land. Water had salty and fresh but salty water have 75%. Fresh water just have 3 percent, so we need to save water. 97% is salty water, 2% is icecap, 3% is fresh water. 1% is groundwater.” [Detailed explanation of the percentages of fresh and salty water sources] | CG1-Unit 12 S2: “Wind will change to typhoon or tomodo [tornado]” S3: “How do make wind…hb Cold air move to the space. This is how we make. And server.” [Less comprehensible expressions] CG2-Unit 13 S3: “We can get water from ocean..gane..water..from pout.” [More direct answer without depth or clear comprehensibility.] |
The bolded italicized parts contributed to the interpretation in the brackets.
Table 4. Raters’ reflections on target vocabulary usage and sentence complexity.
Group | EG (VR) Groups | CG (PPT) Groups |
---|---|---|
Raters’ Reflection | More target vocabulary usage | Less target vocabulary usage |
Examples of Presentation Excerpts | EG1-Unit 11 S3: “People need sleep, shelter, air, water, clothes and food. They are people’s need under (Point to Maslow’s pyramid of the bottom need). …We need safety, love, esteem and self.” S4: “Baby, children, adults, and older persons also need shelter, water, food, air, and clothes. Different baby need love and milk. Children need love. Adult persons need work. Older persons need to sleep.” [Inclusion of more target vocabulary] | CG1-Unit 11 S3: “Humans need foods that (stick?). People and animals are living things. Bread and food, toast can’t move is not living things.” S4: “Plants need water and sun and air. Plant is living things.” [Inclusion of less target vocabulary] |
Raters’ Reflection | More varied and complex structures | Simpler sentence structures |
Examples of Presentation Excerpts | EG1-Unit 14 S1: “..We are group 4, and our poster is about water cycle. Do you think water is ever lost?… Water cycle start from the sun, and the sun gives heat to ocean and the ocean evaporate.” S3: “Evaporation important because evaporation will be cloud, and then the cloud will evapo..precipitation. So we had water cycle.” S2: “Now I will describe the journey of a water droplet through the water cycle from the beginning. The sun heats the ocean water, and the water droplet evaporate, and they have condensation into clouds then water droplet precipitation.” S4: “I will describe four sources of fresh water. Where can we see them in the water cycle? In the river, pond, lake, and groundwater.” S1: “Now you know why water is never lost and why water cycle is important.” [More varied and complex sentence structures] | CG2-Unit 14 S1: “Hi, everyone, we are group 4. Today we make a water cycle poster.” S2: “First we need to know what is the start of water cycle. The sun start from the water cycle. It makes the heat to the ocean. The ocean evaporate into.into the water vapor. The water vapor condensation to the cloud… And groundwater going to the ocean.” S3: “Now, I will give you 3 questions if you correct. I will give you this cookie!” S1: “Water cycle start from the sun. True or False?” S2: “True.” S1: “What is this one?” S2: “Evaporate.” S1: “What.. comes from windmill?” S2: “False” S3: “Good job.” [More incomplete sentences] |
The bolded italicized parts contributed to the interpretation in the brackets.
Table 5. Raters’ reflections on cognitive processing.
Group | EG (VR) Groups | CG (PPT) Groups |
---|---|---|
Raters’ Reflection | Higher-order cognitive processing | Lower-order cognitive processing |
Examples of Presentation Excerpts | EG1-Unit 11 S3: “People need sleep, shelter, air, water, clothes and food. They are people’s need under (Point to Maslow’s pyramid of the bottom need). But we need more. We need safety, love, esteem and self-actualization.” [APPLY/EVALUATE] S4: “Baby, children, adults, and older persons also need shelter, water, food, air, and clothes. Different baby need love and milk. Children need love. Adult persons need work. Older persons need to sleep.” [CONTRAST & ANALYZE] | CG1-Unit 11 S3: “People and animals are living things. Bread and food, toast can’t move is not living things.” [REMEMBER/UNDERSTAND] S4: “Plants need water and sun and air. Plant is living things.” [REMEMBER/UNDERSTAND] |
The bolded italicized parts contributed to the interpretation in the brackets.
Raters’ reflections on poster presentations: complementing quantitative results
Content: Raters’ analysis of poster presentation content revealed that the EG groups exhibited more comprehensive and detailed content (Table 3). EG presentations demonstrated a deeper understanding of scientific concepts, integrating examples from the games into their explanations (e.g., gas distribution in the atmosphere, wind creation, fresh and salty water percentages, air pollution impacts). In contrast, the CG groups relied on more straightforward presentations, often listing answers to prompted questions without delving deeply into the underlying scientific principles.
Communication: Raters noted that the EG groups demonstrated a more extensive and precise use of scientific terminology compared to the CG groups (Table 4). EG presentations incorporated targeted vocabulary more effectively to explain scientific concepts, enhancing the clarity and sophistication of their delivery. In contrast, the CG groups exhibited less nuanced vocabulary usage. Regarding sentence complexity, EG presentations displayed more varied and complex sentence structures than CG presentations, contributing to a more engaging and informative presentation style. Conversely, CG groups tended to use simpler sentence structures.
Cognition: Raters reflected that the EG groups demonstrated deeper cognitive levels of understanding and analysis of scientific concepts compared to the CG groups (Table 5). For instance, EG 1-Unit 11 showcased advanced cognitive processes by applying Maslow’s hierarchy of needs to different life stages, indicating depth and critical thinking. Similarly, EG 2-Unit 12 demonstrated understanding and application by describing wind energy concepts and linking them to real-world examples like air pollution in Taiwan. In contrast, the CG groups primarily exhibited basic cognitive processing, often focusing on recall and simple understanding.
Discussion
Overall, the EG multimodal output in posters and oral presentations demonstrated significantly higher scores in Content compared to the CG groups. Rater reflections indicated that the EG posters demonstrated greater depth, accuracy, and relevance of scientific content, as well as a more balanced integration of graphical elements and textual information to showcase their understanding and knowledge. These enhanced performances in multimodal output could partly be attributed to the multimodal input provided by the VR games, which likely enhanced content retention. Specifically, the information boards that combined graphic and textual information provided scaffolding, thereby supporting learning through mediating texts and artifacts (Vygotsky, 1978; Lin et al., 2023). Interactive features, such as avatars delivering brief audio explanations and information boards with content knowledge texts, enabled students to engage dynamically with additional vocabulary and content knowledge, thus making the learning process more interactive and immersive (Chen and Hsu, 2020; Lai and Chen, 2021). Moreover, the buddy system, which facilitated collaboration as students worked in pairs to read aloud information boards and answer questions, likely enhanced both vocabulary acquisition and comprehension (Lin et al., 2023). Sequential navigation with feedback for both correct and incorrect answers may have reinforced learning and retention (Egbert et al., 2020). Consistent content across the VR and PPT games ensured comprehensive coverage of vocabulary, definitions, applications, and pictures, promoting uniform learning experiences (Lo and Lin, 2019). Notably, the engaging environments in the VR games, such as pollution-focused and water-cycle settings, made the learning experience more relatable and engaging (Fernández-Fontecha et al., 2020). 
Together, the multimodal input from the VR games may have facilitated better retention of information, deeper understanding, and more engaging presentations of scientific concepts in both posters and oral presentations. Additionally, the immersive and interactive nature of the VR games allowed EG students to integrate and apply learned content more effectively, contributing to their superior performance in content and cognitive aspects.
The findings of this study provide significant insights into the role of VR games in enhancing multimodal output within CLIL science education. The superior performance of the EG in Content and Cognition suggests that the immersive and interactive nature of VR games facilitated deeper engagement with scientific content. The multimodal input provided by the VR games, such as the combination of graphic and textual information, seems to have supported students in achieving a more nuanced and comprehensive understanding of scientific concepts. These results align with existing research highlighting the effectiveness of VR environments in fostering cognitive engagement and content retention (Yang et al., 2021; Chen et al., 2022). However, this study goes beyond these findings by demonstrating that VR games can also enhance higher-order cognitive skills, such as analysis and application, which are critical for deep learning in CLIL contexts.
In terms of Communication, while there were no significant differences in sentence length or grammar accuracy, the EG’s higher presentation scores in vocabulary usage and sentence complexity suggest that the VR games provided a more engaging and supportive environment for language use. This finding indicates that VR games can serve as a valuable tool for enhancing language proficiency, particularly in complex linguistic tasks. Raters noted that the EG groups demonstrated more extensive and precise use of scientific terminology (e.g., sink, nitrogen, self-esteem) compared to the CG groups. EG presentations also incorporated targeted vocabulary more effectively to explain scientific concepts (e.g., human needs, water cycle, atmosphere, air composition), enhancing clarity and sophistication. Additionally, EG presentations displayed more varied and complex sentence structures, using sophisticated language to describe scientific phenomena and concepts. In contrast, the CG groups exhibited less nuanced vocabulary usage and simpler sentence structures, perhaps due to their reliance on group-based review games that emphasized comprehension over linguistic complexity.
These Communication results in the EG multimodal output could be attributed to the specific multimodal input provided by the VR games. The VR games included detailed information boards combining graphic and textual information, which supported vocabulary learning and sentence complexity through scaffolding and mediating texts (Vygotsky, 1978; Lin et al., 2023). Interactive features such as avatars delivering brief audio explanations and engaging with additional vocabulary content likely made the learning process dynamic and immersive, reinforcing the use of precise scientific terminology (Chen and Hsu, 2020; Lai and Chen, 2021). The buddy system, where students read aloud and discussed information, may have enhanced vocabulary acquisition and encouraged the use of more complex sentences (Lin et al., 2023). The immersive and interactive nature of VR environments, along with structured feedback mechanisms, provided consistent and engaging multimodal input, fostering a deeper and more sophisticated use of language in both posters and presentations (Egbert et al., 2020; Fernández-Fontecha et al., 2020). The study results align with Rubio-López (2024), who found that integrating multimodal communication using game-based learning significantly enhanced students’ multimodal communicative competence. This parallels the present study, where the use of VR games improved content comprehension and facilitated better communication of scientific concepts through multimodal output. Additionally, Wiboolyasarin et al. (2023) highlighted the potential of 3D virtual worlds in enhancing L2 speaking proficiency, particularly in fluency, pronunciation, and language use. This finding is consistent with the present study’s results, where EG students’ oral presentations were notably better in terms of content depth and vocabulary usage, indicating that the interactive and immersive nature of VR environments supports more effective language practice and application.
Regarding the lack of significant differences in sentence length and grammar accuracy in presentations, raters’ reflections revealed that both groups demonstrated comparable levels of confidence and fluency. This similarity may be attributed to their reliance on notes rather than memorization, which allowed for more natural and composed delivery styles. Furthermore, the comparable grammar accuracy between the EG and CG groups indicates similar levels of linguistic proficiency in their presentations. This could be due to the unedited nature of the presentations, enabling authentic and unrehearsed expression. Additionally, the VR games likely prioritized content comprehension and engagement, potentially overlooking explicit grammar instruction.
The EG multimodal output in posters and oral presentations demonstrated significantly higher scores in Cognition compared to the CG groups. Rater reflections indicated that the EG groups exhibited higher-order cognitive skills, such as evaluation and application, and effective cross-unit application, showcasing robust engagement and understanding in their posters. For presentations, the EG groups displayed a more comprehensive understanding of scientific concepts by integrating examples from the games into their explanations, whereas the CG groups primarily exhibited basic cognitive processing, often focusing on recall and simple understanding, listing answers to prompted questions without delving deeply into the underlying scientific principles. These cognitive results in the EG multimodal output could be attributed to the specific multimodal input provided by the VR games. The VR games included detailed information boards that combined graphic and textual information, which supported higher-order cognitive skills through scaffolding and mediating texts (Vygotsky, 1978; Lin et al., 2023). Interactive features such as avatars delivering brief audio explanations and engaging with additional vocabulary content made the learning process dynamic and immersive, reinforcing the application and evaluation of scientific concepts (Chen and Hsu, 2020; Lai and Chen, 2021). The buddy system, where students read aloud and discussed information, likely enhanced comprehension and encouraged deeper cognitive processing (Lin et al., 2023).
However, much of the existing research on game-based VRLEs focuses on basic language proficiency and creativity (e.g., Chen et al., 2022; Yang et al., 2021) rather than on higher-level cognitive processing skills. The present study reveals that the EG groups demonstrated significantly higher scores in Cognition compared to the CG groups. This finding suggests that VRLEs can enhance not only language skills but also higher-order cognitive skills such as evaluation, application, and cross-unit knowledge integration. This discrepancy indicates that while previous studies have primarily observed improvements in creativity and language proficiency, the potential of VRLEs to foster advanced cognitive skills remains underexplored.
Despite these positive findings, the lack of significant differences in Culture highlights a potential limitation in the design of both the VR and PPT games. The PPT games maintained student engagement through varied question types, repetition, and interactive elements, but these features were aimed primarily at content retention rather than cultural understanding. Similarly, the VR games focused on scientific target vocabulary and content learning, limiting opportunities for cultural exploration and integration. Without explicit cultural elements or contexts within the VR games, learners’ exposure to cultural diversity may have been insufficient. Additionally, the study’s emphasis on scientific concepts and language proficiency may have overshadowed the cultural dimensions typically associated with game-based VRLEs. Future studies should therefore integrate explicit cultural content into immersive learning environments so that VR can foster cultural awareness and sensitivity alongside cognitive and linguistic skills.
Self-regulated learning (SRL) is a dynamic process where learners actively manage their cognitive processes, social interactions, and motivational strategies to achieve educational goals (Usher and Schunk, 2018; Zimmerman, 2000). The findings of this study reflect SRL principles in several ways. The higher scores in Content and Cognition for the EG groups suggest that the VR games facilitated better cognitive and metacognitive engagement. The interactive features of the VR games, such as avatars and detailed information boards, likely supported strategic thinking and task analysis during the forethought phase (Zimmerman, 2002). Additionally, the buddy system exemplifies behavioral strategies of SRL, where learners structured their environment and managed tasks collaboratively, enhancing comprehension and content retention (Weinstein et al., 2011). The immersive nature of VR games may have also fostered intrinsic motivation, driving students to engage deeply with the content (Wallin and Adawi, 2018). However, the lack of significant differences in cultural awareness suggests a gap in SRL application, particularly in the motivational domain concerning cultural sensitivity. This highlights the need for incorporating explicit cultural elements to fully leverage SRL strategies in enhancing not just cognitive and linguistic skills, but also cultural competence.
Conclusion
This study makes empirical, methodological, and theoretical contributions. Empirically, it has demonstrated that game-based VRLEs can substantially enhance students’ multimodal output in terms of content and cognitive skills. The findings showed that students in the EG using VR games produced posters and presentations with greater depth, accuracy, and relevance of scientific content, as well as higher-order cognitive skills such as evaluation and application. This highlights the effectiveness of immersive and interactive VR environments in fostering advanced cognitive processing and content understanding, aligning with the findings of Yang et al. (2021) and Chen et al. (2022) on VR’s potential to improve language learning outcomes. Furthermore, the study revealed that the multimodal input from VR games, including detailed information boards, interactive features, and the buddy system, significantly contributed to these improvements, providing practical insights for educational technology integration.
Methodologically, this study employed a rigorous quasi-experimental design, utilizing multimodal assessments to evaluate both written and oral output in a CLIL context. The use of independent-samples t-tests and rater reflections provided a comprehensive analysis of the data, ensuring that the research questions were answered accurately and that the findings were robust. The integration of multimodal assessments—specifically evaluating student-created posters (writing) and oral presentations (speaking)—was a key methodological strength, as it allowed for a holistic evaluation of students’ learning processes and outcomes. This methodological approach highlights the importance of using diverse semiotic modes to capture a comprehensive understanding of student learning, aligning with the theoretical frameworks proposed by Kress (2000) and Lemke (2002).
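To make the quantitative comparison concrete, the following minimal sketch computes an independent-samples t statistic of the kind referred to above. It assumes equal variances (Student's pooled form). The function name `pooled_t` and the scores are hypothetical, invented purely for illustration; they are not the study's data or its analysis script.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for Student's independent-samples t-test.

    Assumes equal population variances, so the two sample
    variances are pooled before computing the standard error.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # statistics.variance() is the sample variance (n - 1 denominator).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_err
    return t, df


# Hypothetical rubric scores for illustration only (NOT the study's data).
eg_scores = [4, 5, 3, 4, 5, 4]
cg_scores = [3, 3, 2, 4, 3, 3]

t_stat, df = pooled_t(eg_scores, cg_scores)
print(f"t({df}) = {t_stat:.2f}")
```

The resulting statistic would be compared against the t distribution with the reported degrees of freedom to obtain a p-value; a statistics package would normally handle that step.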
Theoretically, this study extends the application of SRL by illustrating how VRLEs can enhance SRL strategies in a CLIL context. The study demonstrated that the immersive nature of VR games supported strategic thinking, task analysis, and collaborative learning, which are core components of SRL. Additionally, the study highlights a critical gap in existing research: while much of the literature focuses on basic language proficiency and creativity, this study emphasizes the potential of VRLEs to foster higher-order cognitive skills, such as evaluation and application, and cross-unit knowledge integration. This theoretical contribution suggests that VRLEs can be a powerful tool not just for language acquisition but for comprehensive educational development. Moreover, the findings indicate the need for deliberate integration of cultural content within immersive learning environments to enhance cultural awareness and sensitivity, an area that has been relatively underexplored. These contributions provide a robust foundation for future research and practical applications in the use of VR technology in education.
This study also contributes to the existing body of research on multimodal assessments by empirically demonstrating their effectiveness in evaluating both written and oral output in a CLIL context. By evaluating student-created posters (writing) and oral presentations (speaking), this study highlights the importance of using diverse semiotic modes to capture a comprehensive understanding of students’ learning processes and outcomes. The integration of text, visuals, and oral modalities provided a richer and more accurate representation of student knowledge and skills, aligning with the theoretical frameworks proposed by Kress (2000) and Lemke (2002). This approach made it possible to assess not only content mastery but also communicative competence, cognitive engagement, and cultural awareness, the four dimensions of Coyle’s (2007) 4Cs framework. The findings reinforce the potential of multimodal assessments to offer holistic evaluations, thereby enabling educators to diagnose learning challenges more effectively and provide targeted feedback. Aligning with Grapin (2020) and Grapin and Llosa (2022), this study extends the empirical evidence on the benefits of multimodal assessments, particularly in game-based VRLEs, and underscores the need for such practices in fostering a more inclusive and effective educational environment.
Limitations and suggestions for future studies
Despite the promising findings, this study has several limitations that warrant consideration. First, the sample size was relatively small and limited to a specific context of fourth-grade students in a CLIL science course in Taiwan, which may limit the generalizability of the results to other educational settings and age groups. The study’s focus on this specific context means that the findings may not fully capture the potential variations in how VRLEs impact students of different ages or in different educational environments. Second, the study focused primarily on the cognitive and content dimensions of learning, with less emphasis on the affective, social, and cultural aspects that could also influence learning outcomes in game-based VRLEs. While the VR games effectively supported content understanding and cognitive processing, they lacked explicit integration of cultural elements, which could have further enhanced students’ cultural awareness and sensitivity. Additionally, while the study employed a comprehensive multimodal assessment approach, the potential for observer bias in qualitative rater reflections cannot be entirely ruled out. Furthermore, the study did not explore the long-term effects of VRLEs on learning outcomes, which is a critical area for future research. Understanding how the benefits of VRLEs persist or evolve over time would provide deeper insights into their sustained impact on student learning. Lastly, although this study has demonstrated the effectiveness of VR-based review games, the lack of a publicly accessible project website limits the opportunity for broader dissemination of the game samples. Future research could address this by creating an online platform to showcase these materials and promote further adoption of VR-based learning tools in educational contexts.
Future research should recruit larger and more diverse samples and extend to a broader range of educational settings and age groups to enhance the generalizability of the findings. Moreover, longitudinal studies could provide valuable insights into the long-term effects of VRLEs on students’ learning outcomes, offering a more comprehensive understanding of their effectiveness over time. It would also be beneficial to explore the integration of cultural content within VR environments, as this could play a significant role in fostering cultural awareness alongside cognitive and linguistic skills. Exploring the affective and social dimensions of learning in VR environments, such as student motivation, engagement, and collaboration, is also important. Lastly, incorporating more objective measures and advanced analytical techniques, such as eye-tracking and interaction analysis, could provide a more nuanced understanding of how students interact with multimodal input in VRLEs and how these interactions impact their learning processes and outcomes.
Author contributions
The author solely conducted the research, analyzed the data, and prepared the manuscript.
Data availability
The datasets used and analyzed during the current study are available in the Zenodo repository at https://doi.org/10.5281/zenodo.13974102. This repository provides open access to all the raw and curated data.
Competing interests
The author declares no competing interests.
Ethical approval
This study was approved by the Research Ethics Committee of the National Changhua University of Education, Taiwan (ROC), in compliance with relevant guidelines and regulations, including the Declaration of Helsinki. The approval number is NCUEREC-111-043, and the approval date was July 6, 2023. The approval covers all interventions and data collection processes conducted in this study.
Informed consent
Written informed consent was obtained from the parents or legal guardians of all participants before the study began. The consent covered participation in the study, data collection, and publication of the research findings. The consent process was conducted by Cheng-Ji Lai on September 2, 2023. All participants and their legal guardians were informed that anonymity would be maintained, and participation was voluntary, with the option to withdraw at any time.
Supplementary information
The online version contains supplementary material available at https://doi.org/10.1057/s41599-024-03999-y.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
Al-Hawamleh, MS; Alazemi, AF; Al-Jamal, DA. Digital portfolio and self-regulation in speaking tasks. Asian-Pac J Second Foreign Lang Educ; 2022; 7,
Alrehaili, E; Al Osman, H. A virtual reality role-playing serious game for experiential learning. Interact Learn Environ; 2019; 30,
Bai, B; Wang, J; Zhou, H. An intervention study to improve primary school students’ self-regulated strategy use in English writing through e-learning in Hong Kong. Comput Assist Lang Learn; 2022; 25,
Bayram, D; Öztürk, RÖ; Atay, D. Reading comprehension and vocabulary size of CLIL and non-CLIL students: a comparative study. Lang Teach Educ Res; 2019; 2,
Beltran-Palanques, V. Assessing video game narratives: implications for the assessment of multimodal literacy in ESP. Assess Writ; 2024; 60, [DOI: https://dx.doi.org/10.1016/j.asw.2024.100809]
Burguillo, JC. Using game theory and competition-based learning to stimulate student motivation and performance. Comput Educ; 2010; 55,
Campoy C, Querol-Julián M (2021) Assessing multimodal listening comprehension through online informative videos: The operationalisation of a new listening framework for ESP in higher education. In: Diamantopoulou S, Ørevik S (eds.), Multimodality in English language learning. Routledge. pp. 238–256
Chen, HJH; Hsu, HL. The impact of a serious game on vocabulary and content learning. Comput Assist Lang Learn; 2020; 33,
Chen, Y-T; Li, M; Huang, C-Q; Han, Z-M; Hwang, G-J; Yang, G. Promoting deep writing with immersive technologies: an SVVR-supported Chinese composition writing approach for primary schools. Br J Educ Technol; 2022; 53,
Cho, M-H; Shen, D. Self-regulation in online learning. Distance Educ; 2013; 34,
Coyle, D. Content and language integrated learning: towards a connected research agenda for CLIL pedagogies. Int J Bilingual Educ Bilingualism; 2007; 10,
Coyle D, Hood P, Marsh D (2010) CLIL: Content and language integrated learning. Cambridge University Press
Dallinger, S; Jonkmann, K; Hollm, J; Fiege, C. The effect of content and language integrated learning on students’ English and history competences—killing two birds with one stone?. Learn Instr; 2016; 41, pp. 23-31. [DOI: https://dx.doi.org/10.1016/j.learninstruc.2015.09.003]
Egbert J, Shahrokni SA, Zhang X, Herman D, Yahia I, Mohamed A, Lopez-Lopez S (2020) Revisiting gaps in the CALL literature. In: Zou B, Thomas M (eds.). Recent developments in technology-enhanced and computer-assisted language learning. IGI Global. pp. 1–29
Espinet M, Valdés-Sánchez L, Hernández M (2018) Science and language experience narratives of pre-service primary teachers learning to teach science in multilingual contexts. In: Danielsson K, Tang KS (eds.). Global developments in literacy research for science education. Springer. pp. 321–338
Fernandes, A; Kahn, LH; Civil, M. A closer look at bilingual students’ use of multimodality in the context of an area comparison problem from a large-scale assessment. Educ Stud Math; 2017; 95,
Fernández-Fontecha, A; O’Halloran, KL; Wignell, P; Tan, S. Scaffolding CLIL in the science classroom via visual thinking: a systemic functional multimodal approach. Linguist Educ; 2020; 55, [DOI: https://dx.doi.org/10.1016/j.linged.2019.100788]
Gilabert, R; Manchón, RM; Vasylets, O. Mode in theoretical and empirical TBLT research: advancing research agendas. Annu Rev Appl Linguist; 2016; 36, pp. 117-135. [DOI: https://dx.doi.org/10.1017/s0267190515000112]
Grapin, SE. Multimodality in the new content standards era: implications for English learners. TESOL Q; 2019; 53,
Grapin SE (2020) Multimodal assessment of English learners in science: expanding what “counts” as evidence of content learning (Publication No. 27834657) [Doctoral dissertation, New York University]. ProQuest Dissertations and Theses Global
Grapin, SE; Llosa, L. Multimodal tasks to assess English learners and their peers in science. Educ Assess; 2022; 27,
Heritage M, Walqui A, Linquanti R (2015) English language learners and the new standards: developing language, content knowledge, and analytical practices in the classroom. Harvard Education Press
Huang, H-YC; Lo, M-F; Tseng, C-J. Applying pedagogical translanguaging via Google Translate to facilitate non-English major juniors in writing scripts for English presentations. Educ Technol Soc; 2024; 27,
Huang, Y-C. The effects of elementary students’ science learning in CLIL. Engl Lang Teach; 2020; 13,
Ivankova, NV; Creswell, JW; Stick, SL. Using mixed-methods sequential explanatory design: From theory to practice. Field Methods; 2006; 18,
Jao, C-Y; Yeh, H-C; Huang, W-R; Chen, N-S. Using video dubbing to foster college students’ English-speaking ability. Comput Assist Lang Learn; 2023; 37,
Kress, G. Multimodality: challenges to thinking about language. TESOL Q; 2000; 34, pp. 337-340. [DOI: https://dx.doi.org/10.2307/3587959]
Lai, K-WK; Chen, H-JH. A comparative study on the effects of a VR and PC visual novel game on vocabulary learning. Comput Assist Lang Learn; 2021; 36,
Lemke, J. Travels in hypermodality. Vis Commun; 2002; 1, pp. 299-325. [DOI: https://dx.doi.org/10.1177/147035720200100303]
Lin, V; Barrett, NE; Liu, G-Z; Chen, N-S; Jong, MS-Y. Supporting dyadic learning of English for tourism purposes with scenery-based virtual reality. Comput Assist Lang Learn; 2023; 36,
Liu, GZ; Lin, V; Kou, X; Wang, HY. Best practices in L2 English source use pedagogy: a thematic review and synthesis of empirical studies. Educ Res Rev; 2016; 19, pp. 36-57. [DOI: https://dx.doi.org/10.1016/j.edurev.2016.06.002]
Liu, JE; Lo, YY; Xin, JJ. CLIL teacher assessment literacy: a scoping review. Teach Teach Educ; 2023; 129, [DOI: https://dx.doi.org/10.1016/j.tate.2023.104150]
Liu, Y; Jang, BG; Roy-Campbell, Z. Optimum input mode in the modality and redundancy principles for university ESL students’ multimedia learning. Comput Educ; 2018; 127, pp. 190-200. [DOI: https://dx.doi.org/10.1016/j.compedu.2018.08.025]
Lo YY (2020) Introduction. In: Professional development of CLIL teachers. CLIL—Is it possible to define or delineate?. Springer. (pp. 3–13) https://doi.org/10.1007/978-981-15-2425-7_1
Lo, YY; Lin, AMY. Special issue: designing multilingual and multimodal CLIL frameworks for EFL students. Int J Bilingual Educ Bilingualism; 2015; 18,
Lo YY, Lin AMY (eds.) (2019). Special issue: teaching, learning and scaffolding in CLIL science classrooms. J Immers Content-Based Lang Educ 7(2), 151–328. https://doi.org/10.1075/bct.115
Mahmood, RQ. Enhancing EFL speaking and pronunciation skills: Using explicit formal instruction in a Kurdish university. Issues Educ Res; 2023; 33,
Nikula, T; Moore, P. Exploring translanguaging in CLIL. Int J Bilingual Educ Bilingualism; 2019; 22,
Neuman, SB; Danielson, K. Enacting content-rich curriculum in early childhood: the role of teacher knowledge and pedagogy. Early Educ Dev; 2021; 32,
Otwinowska, A; Foryś, M. They learn the CLIL way, but do they like it? Affectivity and cognition in upper-primary CLIL classes. Int J Bilingual Educ Bilingualism; 2017; 20,
Parmaxi, A. Virtual reality in language learning: a systematic review and implications for research and practice. Interact Learn Environ; 2023; 31,
Pintrich, PR. A conceptual framework for assessing motivation and self-regulated learning in college students. Educ Psychol Rev; 2004; 16, pp. 385-407. [DOI: https://dx.doi.org/10.1007/s10648-004-0006-x]
Querol-Julián, M; Beltrán-Palanques, V. PechaKucha presentations to develop multimodal communicative competence in ESP and EMI live online lectures: A team-teaching proposal. Comput Assist Lang Learn Electron J; 2021; 22,
Rubio-López, BP. Developing EFL students’ multimodal communicative competence through Lady Whistledown’s Society Papers: a teaching proposal. Profile: Issues Teach’ Professional Dev; 2024; 26,
Santhanam, R; Liu, D; Shen, W-CM. Gamification of technology-mediated training: Not all competitions are the same. Inf Syst Res; 2016; 27,
Schonfeld P (2013) Learning, civic engagement, and digital media: A case study of young adolescents making games and animations about civic issues [Master’s thesis, University of Wisconsin]. University of Wisconsin Digital Library. http://digital.library.wisc.edu/1793/66165
Schöbel, SM; Janson, A; Söllner, M. Capturing the complexity of gamification elements: A holistic approach for analysing existing and deriving novel gamification designs. Eur J Inf Syst; 2020; 29,
Schmitt, N. Instructed second language vocabulary learning. Lang Teach Res; 2008; 12,
Schwinger, M; Otterpohl, N. Which one works best? Considering the relative importance of motivational regulation strategies. Learn Individ Differ; 2017; 53, pp. 122-132. [DOI: https://dx.doi.org/10.1016/j.lindif.2016.12.003]
Shi, A; Wang, Y; Ding, N. The effect of game–based immersive virtual reality learning environment on learning outcomes: designing an intrinsic integrated educational game for pre–class learning. Interact Learn Environ; 2019; 30,
Steffen, JH; Gaskin, JE; Meservy, TO; Jenkins, JL; Wolman, I. Framework of affordances for virtual reality and augmented reality. J Manag Inf Syst; 2019; 36,
Tedick, DJ; Wesely, PM. A review of research on content-based foreign/second language education in US K-12 contexts. Lang Cult Curric; 2015; 28,
Törmälä, V; Kulju, P. Work descriptions written by third-graders: an aspect of disciplinary literacy in primary craft education. J Writ Res; 2023; 15,
Tragant, E; Marsol, A; Serrano, R; Llanes, À. Vocabulary learning at primary school: A comparison of EFL and CLIL. Int J Bilingual Educ Bilingualism; 2016; 19,
Usher EL, Schunk DH (2018) Social cognitive theoretical perspective of self-regulation. In: Schunk DH, Greene JA (eds.). Handbook of self-regulation of learning and performance. 2nd edn. Routledge. pp. 19–35
Valdés-Sánchez, L; Espinet, M. Coteaching in a science-CLIL classroom: changes in discursive interaction as evidence of an English teacher’s science-CLIL professional identity development. Int J Sci Educ; 2020; 42,
Vygotsky LS (1978) Mind in society: The development of higher psychological processes. Harvard University Press
Wallin, P; Adawi, T. The reflective diary as a method for the formative assessment of self-regulated learning. Eur J Eng Educ; 2018; 43,
Webb S (Ed) (2019) The Routledge handbook of vocabulary studies. Routledge
Weinstein CE, Acee TW, Jung J (2011) Self-regulation and learning strategies. New Direct Teach Learn 126:45–53
Wells CG (1999) Dialogic inquiry (vol. 10). Cambridge University Press
Wiboolyasarin, W; Jinowat, N; Wiboolyasarin, K et al. Enhancing L2 speaking proficiency through collaborative tasks in RILCA world: the case of East Asian learners. Asian-Pac J Second Foreign Lang Educ; 2023; 8, 37. [DOI: https://dx.doi.org/10.1186/s40862-023-00209-1]
Winne, P. Theorizing and researching levels of processing in self-regulated learning. Br J Educ Psychol; 2018; 88,
Xu, J. Chinese University Students’ L2 writing feedback orientation and self-regulated learning writing strategies in online teaching during COVID-19. Asia-Pac Educ Researcher; 2021; 30,
Xu, J; Wang, Y. The impact of academic buoyancy and emotions on university students’ self-regulated learning strategies in L2 writing classrooms. Read Writ; 2024; 37, pp. 49-67. [DOI: https://dx.doi.org/10.1007/s11145-023-10411-9]
Yang, G; Chen, Y-T; Zheng, X-L; Hwang, G-J. From experiencing to expressing: a virtual reality approach to facilitating pupils’ descriptive paper writing performance and learning behavior engagement. Br J Educ Technol; 2021; 52,
Zimmerman BJ (2000) Attaining self-regulation: a social cognitive perspective. In: Boekaerts M, Pintrich PR, Zeidner M (eds.). Handbook of self-regulation. Academic Press. pp. 13–39
Zimmerman, BJ. Becoming a self-regulated learner: an overview. Theory Pract; 2002; 41,
Zimmerman BJ (2011) Motivational sources and outcomes of self-regulated learning and performance. In: Zimmerman BJ, Schunk DH (eds.). Handbook of self-regulation of learning and performance. Routledge. pp. 49–64
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Despite the growing recognition of the importance of multimodal input and digital virtual reality (VR) games in enhancing EFL learners’ productive language skills, a significant gap remains in empirical research examining their impact on multimodal output—particularly writing and speaking—within content and language integrated learning (CLIL) science education. This quasi-experimental study addresses this gap by investigating the potential benefits of using VR games to enhance fourth-grade CLIL students’ productive language skills, specifically writing and speaking, through the analysis of their ability to convey scientific concepts in multimodal output. Grounded in self-regulated learning (SRL) theory, the study compares the effects of multimodal input embedded in VR games with those of traditional PowerPoint (PPT)-led games on students’ English poster designs (writing) and oral presentations (speaking), using the 4Cs (Content, Communication, Cognition, and Culture) framework in multimodal assessments. The study involved 81 fourth-grade students from three Taiwanese public elementary schools, divided into an experimental group (EG = 40) using VR-based games and a control group (CG = 41) using PPT-led games for content review. A mixed-methods approach was employed, combining quantitative evaluations with rubrics based on the 4Cs framework and qualitative rater reflections to provide a comprehensive understanding of how different review methods influenced student performance and creative output. Quantitative findings revealed that students using VR review games significantly outperformed those using traditional PPT games in aspects of Content and Cognition for both poster designs and presentations, demonstrating greater depth, accuracy, and application of scientific concepts and higher-order cognitive skills. 
In terms of Communication, the EG showed higher target vocabulary usage and sentence complexity in presentations, but no significant differences were found in Culture outcomes between the groups or in Communication in posters. Expert raters’ reflections further highlighted that students using VR games exhibited more innovative and integrated use of scientific content, critical thinking, and multimodal expressions, reflecting deeper engagement with the material. This study empirically demonstrates that game-based virtual reality learning environments (VRLEs) significantly enhance students’ multimodal output in content and cognitive skills. Theoretically, it extends the application of SRL in CLIL contexts by highlighting the potential of VRLEs to foster advanced cognitive skills and emphasizes the importance of multimodal assessments in capturing comprehensive student learning outcomes. Future research should explore integrating cultural content into VR environments to enhance students’ cultural awareness and sensitivity.