Aim
This review determined the effectiveness of education based on extended reality (XR) for nursing and midwifery students’ anatomy, physiology and pathology education.
Background
Understanding anatomy, physiology and pathology is essential for nursing and midwifery students. XR improves health science students’ anatomical knowledge more than traditional education; however, consistent findings regarding nursing and midwifery students remain lacking.
Design
Systematic review and meta-analysis.
Methods
We searched the MEDLINE, CINAHL, Web of Science, ERIC, CENTRAL and Igaku Chuo Zasshi databases. All randomized controlled trials (RCTs) on XR’s effectiveness for nursing and midwifery students’ anatomy, physiology and pathology education were identified. Pooled effect estimates related to knowledge and learning load were calculated. The certainty of evidence was assessed using the Grading of Recommendations Assessment, Development and Evaluation approach.
Results
The search yielded 619 references, from which 6 RCTs were identified. Compared with traditional education, XR had a moderate positive effect on post-education knowledge, but the difference was not statistically significant (five trials; SMD = 1.04 [95 % CI: −0.25 to 2.33]). For the difference in knowledge between pre- and post-education, XR showed a large positive effect (four trials; SMD = 5.86 [95 % CI: 2.48 to 9.25]). XR also had a significant moderate negative effect on learning load, that is, a lower load than traditional education (three trials; SMD = −0.45 [95 % CI: −0.75 to −0.14]). The certainty of evidence was “very low” for knowledge and “low” for learning load.
Conclusion
XR use in nursing and midwifery students’ anatomy, physiology and pathology education was associated with a lower learning load and improved knowledge. Nevertheless, few studies were included in the meta-analysis, so large RCTs are needed.
1 Introduction
Nursing and midwifery students must develop clinical skills through early and ongoing clinical experiences (Mannino et al., 2021). However, past studies have indicated that nursing and midwifery students are not prepared for clinical practice (Joolaee et al., 2015; Taniguchi et al., 2015). Nursing and midwifery students have expressed concerns about their competence, patient safety and making mistakes during their first clinical placement, highlighting the need for better preparation activities and support systems to increase their confidence and competence (Levett-Jones et al., 2015; Taniguchi et al., 2015). Recent studies have focused on the use of extended reality (XR) in nursing and midwifery education.
2 Background
XR refers to all real-and-virtual combined environments and human–machine interactions generated by computer technology and wearables; it is a generic term covering augmented reality (AR), virtual reality (VR) and mixed reality (MR) (Fast-Berglund et al., 2018). XR is an effective supplement to traditional educational methods and can provide a risk-free learning environment, addressing such issues as reduced preparation and opportunities for clinical practice (Hong and Wang, 2023; Woon et al., 2021). A systematic review of XR technology in medical education showed that AR and VR implemented through head-mounted displays (HMDs) are most commonly used for training in such fields as surgery and anatomy, with quantitative methods predominating in the study designs (Adnan and Xiao, 2023; Barteit et al., 2021).
The educational effects of XR on nursing and midwifery students have been demonstrated in many studies, which have shown that XR improves knowledge (Chen et al., 2020; Liu et al., 2023; Ota et al., 2024; Uslu-Sahan et al., 2023; Woon et al., 2021). This effect on knowledge has been observed regardless of the type of XR or immersion level (Ota et al., 2024). However, existing meta-analyses have revealed a high degree of heterogeneity, highlighting the need to identify specific factors influencing the educational effects of XR, such as the areas and contents of nursing and midwifery education (Ota et al., 2024).
Foundational knowledge in such areas as anatomy, physiology and pathology is critical for shaping clinical competence in nursing and midwifery (Donkin et al., 2022). Traditionally, education has relied on didactic lectures, textbooks and atlases, which present two-dimensional images, together with donor dissections and plastic models (Papa et al., 2019). XR goes beyond the two-dimensional limitations of traditional teaching methods and can help learners understand their lessons by providing three-dimensional visualization of, and interaction with, the inside of the human body, which is not normally possible. Systematic review and meta-analysis evidence on health science students using XR technology has demonstrated improvements in anatomical knowledge, suggesting the utility of these technologies in learning anatomy (García-Robles et al., 2024; Uruthiralingam and Rea, 2020). Furthermore, researchers have conducted studies using VR and AR with nursing and midwifery students in physiology and pathology, which are closely related to anatomy, but such reports are few and the effects are not clear (Bilen, 2018; Gray et al., 2022).
Despite these findings, no systematic reviews or meta-analyses have specifically validated the educational effectiveness of XR in the context of anatomy, physiology, or pathology education for nursing and midwifery students. Therefore, this review aimed to clarify this through a systematic review and meta-analysis.
3 Methods
3.1 Protocol and registration
We performed this review according to the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Page et al., 2021). The protocol for the systematic review was registered with the International Prospective Register of Systematic Reviews (registration number: CRD42024552643; registration date: June 9, 2024). Institutional review board approval was not required, as this study did not include individual patient data.
3.2 Eligibility criteria
We used a population and intervention framework to guide the selection of eligible studies. The inclusion criteria were as follows: (1) population: nursing and midwifery students; (2) intervention: using XR for anatomy, physiology and pathology education; (3) control: traditional education using textbooks, didactic lectures and atlases; and (4) study design: randomized controlled trials (RCTs). Language restrictions were not imposed. Master’s theses and doctoral dissertations were included. Observational studies, non-randomized studies, qualitative studies, case studies, letters, commentaries, review studies, books, conference abstracts, preprints and articles published in non-academic journals were excluded.
3.3 Search strategy
We searched the MEDLINE (via PubMed), CINAHL (via EBSCOhost), Web of Science (via Ovid), ERIC (via ProQuest), Cochrane Central Register of Controlled Trials (CENTRAL) and Igaku Chuo Zasshi databases from their inception to June 10, 2024. We also conducted a citation search (based on Google Scholar) of the studies.
The search terms used in all databases covered the population and intervention defined in the eligibility criteria. They included “nursing student” and “midwife” for the population and “virtual reality,” “augmented reality,” “three-dimension,” “holography,” “anatomy,” “physiology,” and “pathology” for XR-based anatomy, physiology and pathology education. Furthermore, the RCT search filter provided by Cochrane was used to search for RCTs efficiently. The complete search strategy used in all the databases is presented in Text A.1.
3.4 Study selection
Search results were imported into EndNote version 21 (Clarivate Analytics, Tokyo, Japan) to remove duplicates. Screening was based on standard protocols and conducted in two stages: first and second screenings. In the first screening, two of six researchers independently screened study titles and abstracts to identify potentially relevant studies using Rayyan (Rayyan Systems, Cambridge, Massachusetts; https://www.rayyan.ai). Studies were identified based on the research design and on whether they matched the population, intervention and control of the eligibility criteria. The researchers’ screening results were collated, and agreement and disagreement were confirmed. Disagreements between the researchers were resolved through consensus discussions. If disagreements could not be resolved, a third researcher was assigned as the arbiter. We then obtained the full text of the research papers that met the eligibility criteria.
In the second screening, two of six researchers independently evaluated the eligibility of the studies based on a full-text review, applying the same eligibility criteria as in the first screening. The researchers’ screening results were collated and agreement and disagreement were confirmed using the same procedure as that for the first screening, with a third researcher as arbiter for disagreements that could not be resolved. In this way, the studies to be included in the review were identified.
3.5 Data extraction
Data extraction was performed independently by two of six researchers using a standard Google spreadsheet. The items to be extracted were listed in the first row and the information for the identified studies was entered in the subsequent rows. The extracted data included the first author, year of publication, country where the study was carried out, participants, sample size, educational content, study design (type of RCT), XR group intervention content, control group intervention content, whether the intervention was a complement or a replacement, outcomes and measures, time of measurement after intervention, summary of results and protocol registration. Outcomes were not specified in advance; all outcomes compared between the intervention and control groups were extracted. To pool the results, the means and standard deviations of the outcome data were extracted.
3.6 Risk of bias assessment
Two of six researchers independently assessed the risk of bias using the Cochrane Risk of Bias 2.0 tool (Sterne et al., 2019). This tool evaluates bias in randomized trials across five domains: bias arising from the randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome and selection of the reported result. Each domain was rated as “low risk,” “some concerns,” or “high risk” according to the predefined criteria. Disagreements between the researchers were resolved through consensus discussions. If disagreements could not be resolved, a third researcher was assigned as the arbiter.
3.7 Outcome
A meta-analysis was conducted on knowledge as the primary outcome. Secondary outcomes were not restricted in advance and were identified during the review. The outcomes were left unrestricted in order to clarify the various positive and negative effects of educational interventions using XR. Meta-analysis was conducted for outcomes evaluated in multiple studies. Outcomes evaluated in only one study could not be pooled and were therefore presented qualitatively.
3.8 Statistics
The graphs depicting risk of bias were generated using R statistical software version 4.4.1 (R Development Core Team) and the “robvis” package. Review Manager software version 5.4 (Copenhagen, Denmark) was used to perform the meta-analysis based on the Cochrane Handbook for Systematic Reviews of Interventions and PRISMA guidelines (Higgins, 2024; Page et al., 2021). The primary analysis was performed using post-intervention data. To confirm the robustness of the effects, a sensitivity analysis was performed using the difference between pre- and post-intervention data. Subgroup analyses were conducted by participant type and XR type. Participants were divided into nursing and midwifery students. XR types were divided into immersive VR, non-immersive VR and AR. Immersive VR was defined as VR delivered through a viewer with a head-mounted display (Omlor et al., 2022). The estimates were pooled using a random-effects model. The results were expressed as forest plots using standardized mean differences (SMDs) with 95 % confidence intervals (CIs). Heterogeneity among studies was assessed using the I² statistic. I² was classified as low (0–40 %), moderate (30–60 %), substantial (50–90 %), or considerable (75–100 %); these ranges overlap, as in the Cochrane Handbook guidance.
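The pooling described above can be sketched in a few lines. The following is a minimal illustration, not the Review Manager implementation: it assumes Hedges’ g as the SMD, a DerSimonian-Laird estimate of the between-study variance and a normal-approximation 95 % CI. The function names and the three example trials are hypothetical and do not come from the included studies.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance) for group 1 (XR) versus group 2 (control)."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    g = j * d
    # Common large-sample approximation of the variance (an assumption here).
    var_g = j**2 * ((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return g, var_g


def random_effects(effects):
    """Pool (estimate, variance) pairs with a DerSimonian-Laird model."""
    y = [est for est, _ in effects]
    w = [1 / var for _, var in effects]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (var + tau2) for _, var in effects]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "smd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical post-education test scores: (mean, SD, n) for XR, then control.
trials = [
    hedges_g(78.0, 10.0, 30, 70.0, 11.0, 30),
    hedges_g(65.0, 12.0, 25, 64.0, 12.5, 26),
    hedges_g(82.0, 8.0, 40, 71.0, 9.0, 38),
]
result = random_effects(trials)
print(result)
```

A pooled interval that includes zero corresponds to a result that does not reach statistical significance at the 5 % level, and the returned I² value can be read against the classification given above.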
For RCTs that compared three or more arms, the heterogeneity of the interventions was first assessed qualitatively. If the heterogeneity was low, the groups were combined. If heterogeneity was judged to be high, the control group was split according to the number of intervention groups, which increased the number of units of analysis (Axon et al., 2023). If the direction of a rating scale was reversed relative to the other studies, the outcome data were inverted. The inverted mean was calculated by subtracting the reported mean from the sum of the maximum and minimum values of the scale.
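The two adjustments above can be illustrated with a short sketch. It assumes that inversion leaves the standard deviation unchanged and that a shared control group is split into parts with the same mean and SD but a divided sample size, as commonly recommended for multi-arm trials; the `Arm` class, the function names and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    mean: float
    sd: float
    n: int


def invert_scale(arm, scale_min, scale_max):
    """Reverse a scale's direction: new mean = (max + min) - mean."""
    # Reflecting a score does not change its spread, so the SD is kept.
    return Arm(mean=(scale_max + scale_min) - arm.mean, sd=arm.sd, n=arm.n)


def split_control(control, k):
    """Split a shared control arm into k parts to avoid double counting."""
    base, extra = divmod(control.n, k)
    return [
        Arm(control.mean, control.sd, base + (1 if i < extra else 0))
        for i in range(k)
    ]


# Hypothetical 1-5 Likert item where a higher score means a lower load.
reversed_arm = invert_scale(Arm(mean=2.0, sd=0.8, n=30), scale_min=1, scale_max=5)
print(reversed_arm)

# Hypothetical control arm of 31 students shared by two intervention arms.
print(split_control(Arm(mean=70.0, sd=10.0, n=31), k=2))
```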
Holography was excluded from the primary analysis because it did not fall within the definition of XR. However, holography resembles XR in that it offers stereoscopic viewing as an educational intervention; therefore, we also conducted meta-analyses that included holography interventions.
3.9 Certainty of evidence assessment
To assess the certainty of the evidence for the meta-analysis outcomes, the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach was used based on the Cochrane Handbook (Higgins, 2024). For each outcome, two researchers (GA, HS) independently assessed the seriousness of the risk of bias, inconsistency, indirectness and imprecision. Certainty was graded on a four-level scale: “high,” “moderate,” “low,” and “very low.” It was initially set as high, as all of the studies in this review were RCTs, and was downgraded when limitations were found in any domain: there was no downgrade if a domain had no factors that reduced certainty, a one-level downgrade if there were serious factors and a two-level downgrade if there were very serious factors. Additionally, based on the certainty and effects determined in the meta-analysis, the importance of the outcomes was assessed.
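The downgrading rule can be expressed as a small function. This is a simplified sketch that assumes each domain is judged as “not serious” (no downgrade), “serious” (one level) or “very serious” (two levels) and that certainty cannot fall below “very low”; the function name and the example judgements are hypothetical, and in practice the final grade also involves reviewer judgement.

```python
LEVELS = ["very low", "low", "moderate", "high"]
PENALTY = {"not serious": 0, "serious": 1, "very serious": 2}


def grade_certainty(judgements, start="high"):
    """Downgrade from the starting level by the sum of domain penalties."""
    score = LEVELS.index(start) - sum(PENALTY[j] for j in judgements.values())
    return LEVELS[max(score, 0)]  # certainty is floored at "very low"


# Hypothetical judgements for one outcome.
example = {
    "risk of bias": "serious",
    "inconsistency": "serious",
    "indirectness": "not serious",
    "imprecision": "serious",
}
print(grade_certainty(example))  # three one-level downgrades from "high"
```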
4 Results
4.1 Characteristics of included studies
The flowchart of the literature search is shown in Fig. 1. Our search strategy identified 619 studies from electronic databases and citations, leaving 525 records after removing duplicates. Following title and abstract screening, 32 studies were deemed eligible for a full-text review. Twenty-six studies were excluded because they did not meet the inclusion criteria, leaving six studies for analysis. The characteristics of the six studies were qualitatively summarized and a meta-analysis was conducted on the five studies that reported the same outcomes. Studies excluded at full-text screening and the reasons for their exclusion are listed in Table A.1.
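The counts in the selection flow can be cross-checked with simple arithmetic. In the sketch below, only the four input numbers come from the text; the numbers of duplicates removed and of records excluded at title and abstract screening are derived values, assuming no records left the flow for any other reason.

```python
# Counts reported in the text.
identified = 619
after_deduplication = 525
full_text_reviewed = 32
excluded_at_full_text = 26

# Derived counts (not reported in the text).
duplicates_removed = identified - after_deduplication
excluded_at_screening = after_deduplication - full_text_reviewed
included = full_text_reviewed - excluded_at_full_text

print(duplicates_removed, excluded_at_screening, included)
```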
Table 1 shows the characteristics of the six included studies, such as author, year of publication, country, participants, educational content, study design, XR intervention, control intervention, outcomes, measures, timing of measurement after intervention and summary of results. All the studies were written in English and published between 2018 and 2022. Three of the six studies were carried out in the United States (Aebersold et al., 2018; Bilen, 2018; Hackett and Proctor, 2018), whereas the remaining three were undertaken in Turkey (Bolatli and Kizil, 2022), South Korea (An et al., 2022) and Australia (Gray et al., 2022). Five were targeted at nursing students (Aebersold et al., 2018; An et al., 2022; Bilen, 2018; Bolatli and Kizil, 2022; Hackett and Proctor, 2018) and one was targeted at midwifery students (Gray et al., 2022). Three of the six studies dealt with the anatomy of the circulatory system (e.g., anatomy of the mitral valve and structure of the heart) (An et al., 2022; Bilen, 2018; Hackett and Proctor, 2018), while two dealt with the anatomy of the genital system (e.g., anatomy and physiology of the uterus) (Bolatli and Kizil, 2022; Gray et al., 2022). The remaining study dealt with the anatomy related to the nasogastric tube (Aebersold et al., 2018). No study taught physiology or pathology alone; instead, these subjects were taught in combination with anatomy, as in the study on heart dissection and mitral valve disease (Bilen, 2018) and the study on uterus dissection and physiological function (Gray et al., 2022). Regarding XR type, one study used immersive VR (Bilen, 2018), three studies used non-immersive VR (Bolatli and Kizil, 2022; Gray et al., 2022; Hackett and Proctor, 2018) and two studies used AR (Aebersold et al., 2018; An et al., 2022). All studies were parallel RCTs, five of which had a two-armed design while the remaining one was a three-armed parallel RCT (Hackett and Proctor, 2018).
Only one of the six studies had a registered protocol (Bolatli and Kizil, 2022). Regarding whether the educational tools using XR technology were a replacement for or a complement to traditional educational resources, four studies used them as a replacement (Aebersold et al., 2018; Bilen, 2018; Bolatli and Kizil, 2022; Hackett and Proctor, 2018) and two used them as complementary tools (An et al., 2022; Gray et al., 2022).
The outcomes evaluated were knowledge, learning load of XR technology (e.g., mental effort, cognitive load and academic stress), competences (e.g., skill, self-regulated learning and perceived learning), learning flow and anxiety level. Regarding the type of assessment used to quantify knowledge, most studies used tests consisting of multiple-choice questions, whereas one study employed a combination of multiple-choice, drawing, labeling and open-ended questions. Self-report questionnaires using Likert scales were used to assess the learning load of XR technology. Regarding timing, three studies evaluated outcomes immediately after the intervention (Bilen, 2018; Gray et al., 2022; Hackett and Proctor, 2018), one evaluated them three days later (Bolatli and Kizil, 2022) and one re-evaluated them one month later (Gray et al., 2022). The timing was not described in two studies (Aebersold et al., 2018; An et al., 2022).
In these six RCTs, the educational methods using XR were compared with conventional teaching materials and lectures. In two of the five studies that evaluated knowledge, non-immersive VR improved knowledge in the short term (Bolatli and Kizil, 2022; Gray et al., 2022). In the remaining three studies, which used immersive VR, non-immersive VR and AR, there was no significant difference from the control group (An et al., 2022; Bilen, 2018; Hackett and Proctor, 2018). Even in the short-term study where knowledge was improved, the difference between the two groups was no longer significant after 1 month (Gray et al., 2022). Non-immersive VR required significantly less mental effort than color-printed images (Hackett and Proctor, 2018). There was no significant difference in cognitive load between immersive VR and class notes (Bilen, 2018). AR was significantly more effective than animation and didactic lectures for skill competency (Aebersold et al., 2018). Textbooks were significantly more effective than AR for self-regulated learning competency (An et al., 2022). There was no difference in perceived learning competency, learning flow or academic stress between AR and textbooks (An et al., 2022). Non-immersive VR was associated with a significantly lower anxiety level than anatomical models, textbooks and atlases (Bolatli and Kizil, 2022).
Meta-analyses were conducted on the outcomes that had been evaluated in multiple studies: knowledge (five studies) and learning load (three studies). The three learning load studies each evaluated a different construct: mental effort, cognitive load and academic stress. These outcomes were integrated as “learning load,” that is, the load associated with learning (Bedewy and Gabriel, 2015; Makransky et al., 2019; Stone et al., 2024; Sweller, 1988). Because there was only one study for each of skill competency, self-regulated learning competency, perceived learning competency, learning flow and anxiety level, it was not possible to conduct a meta-analysis of these outcomes.
4.2 Risk of bias
The risk of bias was assessed for knowledge and learning load, the outcomes evaluated in multiple studies. Regarding knowledge outcomes overall, the studies of Bilen (2018) and Hackett and Proctor (2018) were assessed as high risk and those of Gray et al. (2022), Bolatli and Kizil (2022) and An et al. (2022) were assessed as having some concerns (Figure A.1). In Bilen (2018), concerns were raised for D1 (bias owing to randomization) and D5 (bias owing to the selection of reported outcomes), which increased the overall risk of bias. Hackett and Proctor (2018) was assessed as being at high risk of bias for D2 (bias owing to deviation from the intended intervention), which increased the overall risk of bias. In the studies of An et al. (2022), Bolatli and Kizil (2022) and Gray et al. (2022), there were concerns about D5 (bias owing to the selection of reported outcomes).
Regarding learning load outcomes, An et al. (2022), Bilen (2018) and Hackett and Proctor (2018) were assessed as being at high risk (Figure A.2). In particular, all of the studies were assessed as being at high risk for D4 (bias in outcome measurement) and D5 (bias owing to the selective reporting of results), which increased the overall risk of bias.
4.3 Meta-analysis
4.3.1 Knowledge
Five RCTs evaluated the effects of XR on knowledge after education. The meta-analysis showed that XR had a moderate positive effect on knowledge after education compared with traditional education but did not reach statistical significance (five trials, n = 280; SMD = 1.04 [95 % CI: −0.25, 2.33]; I² = 95 %) (Fig. 2). In the subgroup analysis, immersive VR and AR could not be integrated because there was only one trial. Non-immersive VR had a moderate positive effect, but did not reach statistical significance (three trials, n = 189; SMD = 1.86 [95 % CI: −0.46, 4.19]; I² = 97 %). In a subgroup analysis of participants, XR had a moderate positive effect on nursing students compared with traditional education, but did not reach statistical significance (four trials, n = 244; SMD = 0.97 [95 % CI: −0.58, 2.52]; I² = 96 %) (Figure A.3). Only one study was conducted on midwifery students; therefore, it could not be integrated.
A meta-analysis of differences between pre- and post-education showed that XR produced a significant very large positive effect (four trials, n = 251; SMD = 5.86 [95 % CI: 2.48, 9.25]; I² = 99 %) (Fig. 3). In the subgroup analysis, AR could not be integrated because there was only one trial. Non-immersive VR had a significant very large positive effect (three trials, n = 189; SMD = 9.36 [95 % CI: 1.18, 17.54]; I² = 99 %). In a subgroup analysis of participants, XR produced a significant very large positive effect on nursing students (three trials, n = 215; SMD = 5.91 [95 % CI: 2.16, 9.62]; I² = 99 %) (Figure A.4). Only one study was conducted on midwifery students; therefore, it could not be integrated.
One study was a three-arm comparison using holography; as such, a meta-analysis was conducted, including holographic intervention (Hackett and Proctor, 2018). The meta-analysis showed that XR, including holographic intervention, had a moderate positive effect compared with traditional education, but did not reach statistical significance (five trials, n = 369; SMD = 1.10 [95 % CI: −0.06, 2.26]; I² = 95 %) (Figure A.5). In the subgroup analysis, immersive VR and AR could not be integrated, because there was only one trial. Non-immersive VR had a moderate positive effect, but did not reach statistical significance (three trials, n = 278; SMD = 1.97 [95 % CI: −0.13, 4.08]; I² = 97 %).
The meta-analysis of differences between pre- and post-education showed that XR, including the holographic intervention, produced a significant very large positive effect (four trials, n = 340; SMD = 5.66 [95 % CI: 3.11, 8.21]; I² = 98 %) (Figure A.6). AR could not be integrated in the subgroup analysis because there was only one trial. Non-immersive VR had a significant very large positive effect (three trials, n = 278; SMD = 9.92 [95 % CI: 3.18, 16.66]; I² = 99 %).
4.3.2 Learning load
Three RCTs evaluated the effects of XR on learning load. The meta-analysis showed that XR had a significant moderate negative effect (three trials, n = 181; SMD = −0.45 [95 % CI: −0.75, −0.14]; I² = 0 %) (Fig. 4). In the subgroup analysis, immersive VR, non-immersive VR and AR could not be integrated because there was only one trial. All three studies were conducted on nursing students, and none were conducted on midwifery students.
The meta-analysis including holographic interventions produced a significant moderate negative effect (three trials, n = 270; SMD = −0.51 [95 % CI: −0.77, −0.25]; I² = 3 %) (Figure A.7). In the subgroup analysis, immersive VR, non-immersive VR and AR could not be integrated because there was only one trial.
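Whether a pooled SMD reaches statistical significance can be checked from its reported interval. The sketch below assumes that the 95 % CIs are symmetric normal-approximation (Wald) intervals, which is an assumption about the software output rather than something stated in the text; the function name is hypothetical.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover the standard error, z statistic and two-sided p-value."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return se, z, p


# Pooled estimates reported in the text for the primary analyses.
knowledge = z_and_p_from_ci(1.04, -0.25, 2.33)
learning_load = z_and_p_from_ci(-0.45, -0.75, -0.14)

for name, (se, z, p) in [("knowledge", knowledge), ("learning load", learning_load)]:
    print(f"{name}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under this assumption, the post-education knowledge estimate gives a p-value above 0.05 and the learning load estimate a p-value below 0.05, in line with the interval for knowledge crossing zero and the interval for learning load lying entirely below it.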
4.4 Evidence certainty and importance
The certainty of the evidence is presented in Table 2. The certainty of the evidence from the five studies was evaluated for knowledge outcomes. The risk of bias, inconsistency and imprecision were identified as serious factors that downgrade the certainty of evidence. As a result, the certainty was graded as “very low” and importance as “not important.” The certainty of evidence from the three studies was evaluated for learning load outcomes. The risk of bias, inconsistency and imprecision were identified as very serious factors that downgrade the certainty of evidence. As a result, certainty of evidence was graded as “low” and importance as “important.”
5 Discussion
The present systematic review and meta-analysis indicated that XR had no statistically significant effect on the post-education knowledge of anatomy, physiology and pathology education for nursing and midwifery students. However, in terms of differences from the pre-education scores, the findings revealed that XR significantly increased knowledge. Moreover, XR was shown to significantly decrease the learning load compared with traditional education.
At the post-education time point only, XR had no significant effect compared with traditional education, whereas XR had a significant positive effect on pre- and post-education differences. This may be explained by the differences in the baselines of the populations covered across the studies. The participants in this review were students and their levels of knowledge varied from one institution to another. Therefore, studies that included institutions whose participants already had a high level of knowledge may not have shown significant differences in effectiveness, which could have affected our results ( Simonsmeier et al., 2022). The lack of significant differences in post-education knowledge scores suggests that XR does not have an absolute advantage over traditional education. However, the finding that XR had a significant effect on the difference between pre- and post-educational knowledge suggests that it had a certain effect on the target population. In other words, XR improved the learners’ knowledge.
These results partly conflict with those of a meta-analysis investigating the effects of XR on anatomy education among health science students (García-Robles et al., 2024). That study found that XR technology resulted in better knowledge acquisition than traditional approaches. This discrepancy may be attributable to population differences. Our review focused solely on nursing students, including midwifery students, whereas the previous study included nursing, medical, biomedical and physiotherapy students. The previous research showed that XR was effective only for undergraduate students and not for graduate students; however, the analysis was not stratified by type of undergraduate school. In actual education, it is rare to educate students from many different undergraduate schools together. Therefore, pooling studies by type of undergraduate school, as in this study, provides important findings at the level of the educational field. In the future, it will be necessary to conduct a meta-analysis to compare and examine whether the effects of XR differ depending on the education target, such as medical, biomedical, nursing and physiotherapy students.
Using XR in education decreased the students’ learning load compared with traditional education. The reason for this may be that XR education adds three-dimensional information; three-dimensional visualization may facilitate conversion between two-dimensional and three-dimensional thinking, resulting in a reduced cognitive load. In a previous study that used VR to educate medical students on anatomy, participants were positive about the use of VR and perceived it as helping them visualize anatomical structures in three dimensions (da Cruz Torquato et al., 2023). However, there are still no such data for nursing students and future research is required to examine the mechanism by which 3D visualization reduces cognitive load in this population. Additionally, all three studies that evaluated the learning load addressed education on the cardiac system, so it is not possible to generalize the findings to other fields. Therefore, it may be necessary to conduct research to verify whether cognitive load can be reduced in the same way through 3D visualization when learning areas other than the cardiac system. If the mechanism were to be clarified, it would demonstrate that XR tools, which provide 3D visualization in anatomy education, aid learning better than traditional teaching methods do.
Based on these results, XR, as a teaching method for pathophysiology and anatomy, has a lower learning load and is more effective in improving knowledge. A systematic review of previous research reported that for medical professional students, 72 % of studies showed that active learning is superior to passive instruction in terms of lower-level cognition, such as recalling, understanding and applying material; meanwhile, 84 % of studies showed that active learning is superior to passive instruction in terms of higher-level cognition, such as improving students’ confidence and performance in analysis, evaluation and creativity (Harris and Welch Bacon, 2019). Traditional teaching methods, such as textbooks, atlases and didactic lectures, are passive approaches, whereas XR may have promoted knowledge acquisition because it includes an active element of interaction. However, the risk of bias may be higher for learning load, as this is a self-reported outcome. Additionally, XR does not require the use of specimens for dissection; therefore, it has fewer educational restrictions and is less costly.
6 Limitations and directions for future research
This review has some limitations. First, the results of this meta-analysis may only be generalizable to nursing students and not to midwifery students. This is because the only study of midwifery students was that by Gray et al. (2022) and it was not possible to integrate it. Additionally, when we conducted a subgroup analysis of nursing students and midwifery students separately, the results did not change for nursing students. This shows the robustness of the results for the primary outcome. Therefore, it can be determined that XR improved knowledge compared with traditional education for nursing students. In addition, Gray et al. (2022) showed that XR increased knowledge to a statistically significant degree. If more similar RCTs are conducted in the future, it may be possible to demonstrate the effectiveness of XR for midwifery students. Second, the results are probably valid only for anatomy. Although this review included anatomy, physiology and pathology in its search, only one study dealt with physiology and only one with pathology, and both taught these subjects in combination with anatomy. Therefore, it is not possible to generalize the results of this review to physiology and pathology education. More research on education using XR for physiology and pathology would be beneficial. Third, the results of this review should be interpreted with caution, as few studies were included in the data synthesis. There is a need for well-designed, large-scale RCTs using XR on anatomy, physiology and pathology for nursing and midwifery students in the future. Moreover, only one study included in this review had a registered protocol. To limit the risk of bias, future trials will need to comply with such procedures as randomization and protocol registration.
Fourth, the certainty of the evidence ranged from low to very low. Heterogeneity was high in the synthesis of knowledge outcomes, largely because of differences in intervention content and measurement methods. The high risk of bias related to these outcome measures reduced the quality of the evidence. In particular, the assessment methods were developed independently in each study and their validity was not evaluated. As the number of studies was too small for publication bias to be analyzed, its possibility could not be ruled out. Conversely, selection bias was reduced in this review because no language restrictions were imposed and dissertations were included. Fifth, the included studies did not evaluate long-term educational effectiveness, which remains unclear. To demonstrate that knowledge has been retained, it will be necessary to re-evaluate knowledge after at least 3–6 months. However, the results of this review show that knowledge improved up to 1 month post-education compared with pre-education, which indicates that XR has a short-term knowledge improvement effect.
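The high heterogeneity noted above can be made concrete with a small calculation. The following is a minimal Python sketch, assuming a DerSimonian-Laird random-effects model; the text here does not state which estimator the review used, and the effect sizes and variances in the example are hypothetical values chosen only for illustration, not data from the included trials. The function name `random_effects_pool` is likewise illustrative.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study-level SMDs with a DerSimonian-Laird random-effects model.

    effects   -- list of study SMDs
    variances -- list of their sampling variances (same order)
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance (truncated at zero) and I-squared in percent
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau2 to each study's variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical inputs (NOT taken from the included trials):
# four small effects and one very large one.
hypothetical_smd = [0.1, 0.4, 0.2, 4.3, 0.6]
hypothetical_var = [0.14, 0.05, 0.07, 0.21, 0.12]
result = random_effects_pool(hypothetical_smd, hypothetical_var)
print(result)
```

With one study far from the rest, as in these hypothetical inputs, I² becomes large and the confidence interval of the pooled estimate widens, which mirrors the pattern described for the knowledge outcome.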
7 Conclusion
This meta-analysis evaluated the effectiveness of XR-based education for teaching anatomy, physiology and pathology to nursing and midwifery students. The findings suggest that educational methods using XR have a lower learning load than traditional methods and are also more effective in improving nursing and midwifery students’ knowledge of anatomy. However, the results of this review should be interpreted with caution, as few studies were included in the meta-analysis. Therefore, well-designed, large-scale RCTs are necessary in the future.
Funding
This work was supported by Kanto Gakuin University Nursing Research Institute Research Grant (grant number: N/A). The funding source had no role in the design, conduct, or analysis of this study.
CRediT authorship contribution statement
Aikawa Gen: Writing – review & editing, Writing – original draft, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Kawashima Tetsuharu: Writing – original draft, Visualization, Resources, Investigation, Formal analysis, Data curation. Ota Yuma: Writing – original draft, Investigation, Formal analysis, Data curation, Conceptualization. Watanabe Mayumi: Writing – original draft, Visualization, Investigation, Formal analysis, Data curation. Nishimura Ayako: Writing – original draft, Investigation, Data curation, Conceptualization. Sakuramoto Hideaki: Writing – original draft, Methodology, Investigation, Funding acquisition, Data curation, Conceptualization.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendix
| | |
| | |
| 1 | Students, Nursing[Mesh Terms] OR "Nursing Student*"[Title/Abstract] OR nurse[Title/Abstract] OR nurses[Title/Abstract] OR nursing[Title/Abstract] OR midwife*[Title/Abstract] |
| 2 | student*[Title/Abstract] OR pupil*[Title/Abstract] OR apprentice*[Title/Abstract] OR trainee*[Title/Abstract] OR baccalaureate[Title/Abstract] OR undergraduate*[Title/Abstract] OR pre-licensure[Title/Abstract] OR pre-registration[Title/Abstract] OR college[Title/Abstract] OR university[Title/Abstract] |
| 3 | #1 AND #2 |
| 4 | Anatomy[Mesh Terms] OR anatom*[Title/Abstract] OR Physiology[Mesh Terms] OR physiolog*[Title/Abstract] OR Pathology[Mesh Terms] OR patholog*[Title/Abstract] OR "Diseases Category"[Mesh Terms] |
| 5 | Virtual Reality[Mesh Terms] OR Augmented Reality[Mesh Terms] OR Patient-Specific Modeling[Mesh Terms] OR User-Computer Interface[Mesh Terms] OR "realit*"[Title/Abstract] OR "simulat*"[Title/Abstract] OR VR[Title/Abstract] OR AR[Title/Abstract] OR MR[Title/Abstract] OR XR[Title/Abstract] OR "User-Computer Interface"[Title/Abstract] OR "three-dimension*"[Title/Abstract] OR Imaging, Three-Dimensional[Mesh Terms] OR "3D"[Title/Abstract] OR "virtual*"[Title/Abstract] OR Holography[Mesh Terms] OR "hologra*"[Title/Abstract] |
| 6 | (randomized controlled trial[pt] OR controlled clinical trial[pt] OR randomized[pt] OR placebo[Title/Abstract] OR drug therapy[sh] OR randomly[Title/Abstract] OR trial[Title/Abstract] OR groups[Title/Abstract]) NOT (animals[Mesh Terms] NOT humans[Mesh Terms]) |
| 7 | #3 AND #4 AND #5 AND #6 |
| | |
| 1 | Students, Nursing[Mesh Terms] OR "Nursing Student*"[Title/Abstract] OR nurse[Title/Abstract] OR nurses[Title/Abstract] OR nursing[Title/Abstract] OR midwife*[Title/Abstract] |
| 2 | student*[Title/Abstract] OR pupil*[Title/Abstract] OR apprentice*[Title/Abstract] OR trainee*[Title/Abstract] OR baccalaureate[Title/Abstract] OR undergraduate*[Title/Abstract] OR pre-licensure[Title/Abstract] OR pre-registration[Title/Abstract] OR college[Title/Abstract] OR university[Title/Abstract] |
| 3 | #1 AND #2 |
| 4 | Anatomy[Mesh Terms] OR anatom*[Title/Abstract] OR Physiology[Mesh Terms] OR physiolog*[Title/Abstract] OR Pathology[Mesh Terms] OR patholog*[Title/Abstract] OR "Diseases Category"[Mesh Terms] |
| 5 | Virtual Reality[Mesh Terms] OR Augmented Reality[Mesh Terms] OR Patient-Specific Modeling[Mesh Terms] OR User-Computer Interface[Mesh Terms] OR "realit*"[Title/Abstract] OR "simulat*"[Title/Abstract] OR VR[Title/Abstract] OR AR[Title/Abstract] OR MR[Title/Abstract] OR XR[Title/Abstract] OR "User-Computer Interface"[Title/Abstract] OR "three-dimension*"[Title/Abstract] OR Imaging, Three-Dimensional[Mesh Terms] OR "3D"[Title/Abstract] OR "virtual*"[Title/Abstract] OR Holography[Mesh Terms] OR "hologra*"[Title/Abstract] |
| 6 | #3 AND #4 AND #5 |
| | |
| 1 | SU(Nursing Students) OR TI,AB("nursing student") OR SU(Nurses) OR TI,AB(nurse) OR SU(nursing) OR TI,AB(midwi*) |
| 2 | SU(Students) OR TI,AB(student*) OR TI,AB(pupil*) OR SU(Apprenticeships) OR TI,AB(apprentice*) OR SU(Trainees) OR TI,AB(trainee*) OR SU("Bachelors Degrees") OR TI,AB(baccalaureate) OR SU(Undergraduate Students) OR TI,AB(undergraduate*) OR TI,AB(pre-licensure) OR TI,AB(pre-registration) OR SU(Colleges) OR TI,AB(college) OR SU(Universities) OR TI,AB(university) |
| 3 | #1 AND #2 |
| 4 | SU(Anatomy) OR TI,AB(anatom*) OR SU(Physiology) OR TI,AB(physiolog*) OR SU(Pathology) OR TI,AB(patholog*) OR SU(Disease) |
| 5 | SU(Computer Interfaces) OR SU(Simulation) OR SU(Computer Simulation) OR TI,AB(realit*) OR TI,AB(simulat*) OR TI,AB(VR) OR TI,AB(AR) OR TI,AB(MR) OR TI,AB(XR) OR TI,AB(three-dimension*) OR TI,AB(3D) OR TI,AB(virtual*) OR TI,AB(hologra*) OR TI,AB("user-computer interface*") |
| 6 | TI,AB(random*) OR TI,AB,SU(placebo) OR TI,AB("double-blind*") |
| 7 | #3 AND #4 AND #5 AND #6 |
| | |
| 1 | MH(Students, Nursing) OR TI,AB("Nursing Student*") OR TI,AB(nurse) OR TI,AB(nurses) OR TI,AB(nursing) OR TI,AB(midwife*) |
| 2 | TI,AB(student*) OR TI,AB(pupil*) OR TI,AB(apprentice*) OR TI,AB(trainee*) OR TI,AB(baccalaureate) OR TI,AB(undergraduate*) OR TI,AB(pre-licensure) OR TI,AB(pre-registration) OR TI,AB(college) OR TI,AB(university) |
| 3 | #1 AND #2 |
| 4 | MH(anatomy) OR TI,AB(anatom*) OR MH(physiology) OR TI,AB(physiolog*) OR MH(pathology) OR TI,AB(patholog*) OR MH("Diseases Category") |
| 5 | MH("Virtual Reality") OR MH("Augmented Reality") OR MH("Patient-Specific Modeling") OR MH("user-computer interface*") OR TI,AB(realit*) OR TI,AB(simulat*) OR TI,AB(VR) OR TI,AB(AR) OR TI,AB(MR) OR TI,AB(XR) OR TI,AB("user-computer interface*") OR TI,AB("three-dimension*") OR MH("Imaging, Three-Dimensional") OR TI,AB(3D) OR TI,AB(virtual*) OR MH("holography") OR TI,AB(hologra*) |
| 6 | (PT(randomized controlled trial) OR TI,AB("controlled clinical trial") OR TI,AB(randomized) OR TI,AB(placebo) OR MH(drug therapy) OR TI,AB("drug therapy") OR TI,AB(randomly) OR TI,AB(trial) OR TI,AB(groups)) NOT (MH(Animals) NOT MH(Human)) |
| 7 | #3 AND #4 AND #5 AND #6 |
| | |
| 1 | TS=("Nursing Student*" OR nurse OR nurses OR nursing OR midwi*) |
| 2 | TS=(student* OR pupil* OR apprentice* OR trainee* OR baccalaureate OR undergraduate* OR pre-licensure OR pre-registration OR college OR university) |
| 3 | #1 AND #2 |
| 4 | TS=(anatomy OR anatom* OR physiology OR physiolog* OR pathology OR patholog*) |
| 5 | TS=("Patient-Specific Modeling" OR realit* OR simulat* OR VR OR AR OR MR OR XR OR "user-computer interface*" OR "three-dimension*" OR 3D OR virtual* OR hologra*) |
| 6 | TS=("randomized controlled trial" OR "randomised controlled trial" OR "randomized trial" OR "randomised trial" OR "random allocation" OR "randomly allocated" OR "randomized" OR "randomised" OR "placebo controlled" OR "double blind" OR "single blind") AND TS=("trial" OR "study" OR "clinical trial" OR "clinical study") |
| 7 | #3 AND #4 AND #5 AND #6 |
| | |
| 1 | 看護学生/TH OR 看護師/TH OR 看護/TH OR 看護/TA OR 助産師/TH OR 助産/TA |
| 2 | 学生/TH OR 生徒/TA OR 実習生/TA OR 研修生/TA OR 学士/TA OR 学位/TA OR 学部生/TA OR 免許取得前/TA OR 助産学校/TA OR 専門学校/TH OR 大学/TA OR 短大/TA OR 基礎教育/TA OR 大学/TH |
| 3 | #1 AND #2 |
| 4 | 生理学/TH OR 解剖学/TH OR 病態生理/TH OR 生理/TA OR 解剖/TA OR 病態/TA |
| 5 | ユーザーインターフェース/TH OR バーチャルリアリティ/TH OR VR/TA OR オーグメンテッドリアリティ/TA OR AR/TA OR 患者特異的モデル化/TH OR 複合現実/TA OR 拡張現実/TH OR 仮想現実/TA OR クロスリアリティ/TA OR コンピューターインターフェース/TA OR シミュレーション環境/TA OR 仮想/TA OR 3次元/TA OR 3D/TA |
| 6 | ランダム化比較試験/TH OR 準ランダム化比較試験/TH OR ランダム化/AL OR 無作為化/AL OR 比較試験/AL OR 臨床試験/AL OR プラセボ/AL OR 対照/AL OR コントロール/AL OR 臨床研究/AL |
| 7 | #3 AND #4 AND #5 AND #6 |
| Author | Year | Title | Reason for exclusion |
| Akhu-Zaheya et al. | 2013 | Effectiveness of simulation on knowledge acquisition, knowledge retention, and self-efficacy of nursing students in Jordan | Wrong intervention |
| Alous | 2019 | Lecture-based education versus simulation in educating student nurses about central line-associated bloodstream infection-prevention guidelines | Wrong intervention |
| Avci and Kilic | 2024 | The effect of augmented reality applications on intravenous catheter placement skill in nursing students: A randomized controlled study | Wrong intervention |
| Bani Salame et al. | 2024 | Effect of virtual reality simulation as a teaching strategy on nursing students’ satisfaction, self-confidence, performance, and physiological measures in Jordan | Wrong intervention |
| Berg and Steinsbekk | 2020 | Is individual practice in an immersive and interactive virtual reality application non-inferior to practicing with traditional equipment in learning systematic clinical observation? A randomized controlled trial | Wrong intervention |
| Chung et al. | 2011 | The effects of practicing with a virtual ultrasound trainer on FAST Window Identification, Acquisition, and Diagnosis. CRESST Report 787 | Wrong population |
| Cioffi et al. | 2005 | A pilot study to investigate the effect of a simulation strategy on the clinical decision making of midwifery students | Wrong intervention |
| Das and Mitchell | 2013 | Comparison of three aids for teaching lumbar surgical anatomy | Wrong population |
| Dick-Smith et al. | 2020 | Comparing real-time feedback modalities to support optimal cardiopulmonary resuscitation for undergraduate nursing students: A quasi-experimental cross-over simulation study | Wrong intervention |
| Ferguson | 2013 | Interpretation of feeding and swallowing in preterm infants: influence of video simulation training | Wrong intervention |
| Gray | 2022 | Together we learn: Using three-dimensional visualisation of the third stage of labour to teach midwifery students | Wrong publication |
| Gunay Ismailoglu et al. | 2018 | Comparison of the Effectiveness of a Virtual Simulator with a Plastic Arm Model in Teaching Intravenous Catheter Insertion Skills | Wrong intervention |
| HadaviBavil and Ilcioglu | 2024 | Artwork in anatomy education: A way to improve undergraduate students' self-efficacy and attitude | Wrong intervention |
| Lee et al. | 2023 | Virtual reality simulation-enhanced blood transfusion education for undergraduate nursing students: A randomised controlled trial | Wrong intervention |
| Menon et al. | 2022 | Augmented reality in nursing education - A pilot study | Wrong intervention |
| Ozdemir and Unal | 2023 | The effect of breast self-examination training on nursing students by using hybrid-based simulation on knowledge, skills, and ability to correctly evaluate pathological findings: Randomized controlled study | Wrong intervention |
| Pereda-Nunez et al. | 2023 | Pelvic + Anatomy: A new interactive pelvic anatomy model. Prospective randomized control trial with first-year midwife residents | Wrong intervention |
| Rodriguez-Abad et al. | 2022 | Effectiveness of augmented reality in learning about leg ulcer care: A quasi-experimental study in nursing students | Wrong intervention |
| Sezgunsay and Basak | 2020 | Is Moulage effective in improving clinical skills of nursing students for the assessment of pressure injury? | Wrong intervention |
| Stone et al. | 2024 | Cognitive and physiological evaluation of virtual reality training in nursing | Wrong intervention |
| Veredas et al. | 2014 | A web-based e-learning application for wound diagnosis and treatment | Wrong intervention |
| Zare Bidaki and Ehteshampour | 2019 | Designing, producing, application, and evaluation of virtual reality-based multimedia clips for learning purposes of medical and nursing students | Wrong publication |
| 刘靓 | 2018 | 3D 多媒体软件在护理专业人体解剖学教学中的应用 [Application of 3D multimedia software in human anatomy teaching for nursing majors] | Wrong study design |
| No data | 2020 | Teaching fetal development with virtual reality | Wrong study design |
| No data | 2022 | Gamification and augmented reality in mechanical ventilation teaching for nursing students | Wrong study design |
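| Study, country | Participants | Learning content | Study design | Intervention | Control | Outcome measures | Timing of post-test | Main results |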
| Aebersold et al. (2018), USA | Sophomore and junior nursing students attending a baccalaureate nursing program | Nasogastric tube placement skills | Two-arm RCT | Training videos were viewed, and simulation exercises were performed using an iPad-based anatomy-augmented virtual simulation training module, replacing control (n = 35) | Animated video and didactic content (n = 34) | Skill competency: a checklist of 21 activities required to demonstrate competency in successful nasogastric tube placement, assessed by faculty assessors. | No information | The AR group scored significantly higher on skill competency (Control: 15.39 ± 1.01 vs. AR: 15.96 ± 0.75, p = 0.011) |
| Bilen (2018), USA | Students in the nursing department of a university | Anatomy and disease of the mitral valve | Two-arm RCT, pre-posttest design | Immersive VR models through “3D4Medical software” in a head-mounted display system, replacing control (n = 15) | Class notes (n = 14) | Knowledge: a 15-question test, including drawing and labeling the four components of the mitral valve, knowledge of mitral stenosis and mitral regurgitation, and application of that knowledge, using multiple-choice and open-ended questions. Cognitive load: four items rating the mental effort invested while learning about the mitral valve, on direct subjective 6-point or 9-point scales. | Post-test was conducted after a 10-minute break. | No significant difference between the groups in mean scores for knowledge of the anatomical structure of the mitral valve and clinical cases. No significant difference in cognitive load between the groups. |
| Hackett and Proctor (2018), USA | Nursing students at the Medical Education and Training Campus | Cardiac anatomy | Three-arm RCT, pre-posttest design | Monoscopic 3-D: Heart models were converted to 3-D PDF using the PDF3D ReportGen program (PDF3D, Ascot, Berkshire, UK) and displayed on a laptop computer, replacing control (n = 60). Holograms: Holograms printed on photopolymer were displayed for an autostereoscopic experience, replacing control (n = 60). | 2-D images of the heart model printed in color (n = 59) | Knowledge: Six multiple-choice conceptual questions and nine questions requiring participants to match anatomical names to the corresponding structures. Mental effort: the NASA-Task Load Index, a subjective multidimensional assessment scale, wherein participants report the workload required to study cardiac anatomy. | Post-tests of knowledge and mental effort were conducted immediately after the intervention. | In the post-test knowledge test, the hologram group scored significantly higher than the other two groups (hologram: 80.0 [66.7–86.7] vs. Monoscopic 3-D: 66.7 [53.3–80.0] vs. Printed images: 66.7 [53.3–80.0], p = 0.008). In the mental effort test, the printed images group scored significantly higher than the other two groups (hologram: 4.9 ± 3.56 vs. Monoscopic 3-D: 4.9 ± 3.79 vs. Printed images: 7.5 ± 4.9, p = 0.003). |
| An et al. (2022), Korea | First- and second-year nursing students from two universities | Skeletal system and structure of the heart | Two-arm RCT, pre-posttest design | An image-based AR mobile application (DEVAR Entertainment LLC, Marlton, NJ, USA) was operated on iPhone and/or Android, displaying three-dimensional virtual anatomical objects in an AR book, complementing control (n = 31). | Textbook (n = 31) | Knowledge: 10 multiple-choice questions. Self-Regulated Learning competency: the Online Self-Regulated Learning Questionnaire, a 24-item questionnaire with a 5-point Likert scale. Perceived learning competency (perceived cognitive, affective, and psychomotor effects of Self-Regulated Learning): the CAP Perceived Learning Scale, a nine-item, 6-point Likert scale. Learning flow (the degree to which participants experienced immersion in anatomy learning): the short version of the Flow State Scale, nine items measured on a 5-point Likert scale. Academic stress: the Perception of Academic Stress Scale, 18 items with a 5-point Likert scale. | No information | There was no significant interaction between the effects of time and the intervention for perceived learning competency, knowledge, academic stress, or learning flow. Self-Regulated Learning competency increased significantly more in the textbook group than in the AR group after the intervention (p = 0.020) |
| Bolatli and Kizil (2022), Turkey | Students learning genital system anatomy | Genital system anatomy | Two-arm RCT, pre-posttest design | The mobile application “genitalsystem.apk” installed on the smartphones of the students, replacing control (n = 32) | Anatomical model of the genital system organs, textbooks and atlases (n = 31) | Knowledge: 22 multiple-choice questions including questions on anatomical structure. Anxiety level: State-Trait Anxiety Inventory, with two subscales: state anxiety and trait anxiety. Each subscale consisted of 20 items and was graded on a 4-point Likert-type scale. | Posttest took place three days later. | After training, the experimental group scored significantly higher on the knowledge test (Control: 40.4 ± 6.9 vs. Experimental: 67.7 ± 5.5, p = 0.045) and significantly lower on State Anxiety (Control: 45.6 ± 8.7 vs. Experimental: 40.4 ± 8.3, p = 0.022) |
| Gray et al. (2022), Australia | Midwifery students in their second year of a Bachelor of Midwifery | The third stage of labor (anatomy and physiology of the uterus, and the process of achieving hemostasis to prevent hemorrhage) | Two-arm RCT, pre-posttest design | A narrated three-dimensional midwifery visualization resource was sent to the students' mobile phones, which they viewed using three-dimensional glasses attached to their devices, complementing control (n = 19). | Traditional educational methods (n = 17) | Knowledge: 30 multiple-choice questions | Posttest took place immediately after and one month later. | Immediately after training, the experimental group scored significantly higher on the knowledge test (3DMVR group: 23.0 vs. Control group: 18.4, p < 0.001), but after one month the difference between the two groups was no longer significant (3DMVR group: 21.2 ± 3.5 vs. Control group: 19.4 ± 4.2, p = 0.34). |
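To illustrate how a standardized mean difference is obtained from group summaries of the kind reported in the table above, the following Python sketch computes Hedges' g from the post-test means and standard deviations reported by Bolatli and Kizil (2022). It assumes the small-sample-corrected SMD (Hedges' g) with a pooled standard deviation; whether the review used exactly this formula is an assumption, and the function name `hedges_g` is illustrative.

```python
import math


def hedges_g(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (g, variance) for two independent groups."""
    # Pooled standard deviation across both groups
    pooled_sd = math.sqrt(
        ((n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2)
        / (n_exp + n_ctrl - 2)
    )
    d = (mean_exp - mean_ctrl) / pooled_sd
    # Small-sample correction factor J
    j = 1 - 3 / (4 * (n_exp + n_ctrl) - 9)
    g = j * d
    # Common large-sample approximation of the sampling variance of g
    var_g = (n_exp + n_ctrl) / (n_exp * n_ctrl) + g ** 2 / (2 * (n_exp + n_ctrl))
    return g, var_g


# Post-test knowledge scores as reported in the table for
# Bolatli and Kizil (2022): experimental 67.7 ± 5.5 (n = 32),
# control 40.4 ± 6.9 (n = 31).
g, var_g = hedges_g(67.7, 5.5, 32, 40.4, 6.9, 31)
print(f"Hedges' g = {g:.2f}, variance = {var_g:.3f}")
```

Under these assumptions, the reported values correspond to a very large standardized difference (above 4), the kind of between-study variation that can contribute to high heterogeneity in a pooled knowledge outcome.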
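| № of studies | Study design | Risk of bias | Inconsistency | Indirectness | Imprecision | Other considerations | № of patients: XR | № of patients: traditional education | Effect (95 % CI) | Certainty | Importance |
| Knowledge (post-education) | | | | | | | | | | | |
| 5 | randomized trials | serious a | serious b | not serious | serious c | none | 157 | 123 | SMD 1.04 (−0.25 to 2.33) | ⨁◯◯◯ Very low | NOT IMPORTANT e |
| Learning load | | | | | | | | | | | |
| 3 | randomized trials | very serious d | not serious | not serious | not serious | none | 106 | 75 | SMD −0.45 (−0.75 to −0.14) | ⨁⨁◯◯ Low | IMPORTANT f |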
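The pooled estimates reported in the abstract (SMD = 1.04 [95 % CI: −0.25–2.33] for post-education knowledge and SMD = −0.45 [95 % CI: −0.75 to −0.14] for learning load) can be checked against their confidence intervals with simple arithmetic. The Python sketch below recovers the standard error from a 95 % CI and derives an approximate two-sided p value, assuming a symmetric normal-based interval; this is an assumption, as the review may have used a different interval method. The function name `z_and_p_from_ci` is illustrative.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z statistic and two-sided p value from a 95 % CI.

    Assumes the interval is estimate ± z_crit * SE (normal approximation).
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided p value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Pooled estimates as reported in the abstract
knowledge = z_and_p_from_ci(1.04, -0.25, 2.33)
learning_load = z_and_p_from_ci(-0.45, -0.75, -0.14)

for label, (se, z, p) in [("knowledge", knowledge), ("learning load", learning_load)]:
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under this assumption, the post-education knowledge estimate gives a p value above 0.05 and the learning load estimate a p value below 0.05, consistent with the statements that the former difference was not significant and the latter was.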
©2025. Elsevier Ltd