Background
Surgery has evolved from a hands-on discipline, in which skills were acquired via the “learning by doing” principle, to a surgical science with attention to patient safety, health care effectiveness and evidence-based research. A variety of simulation modalities have been developed to meet the need for effective resident training. So far, research regarding surgical training for minimally invasive surgery has been extensive but heterogeneous in its grade of evidence.
Methods
A literature search was conducted to summarize current knowledge about simulation training and to guide research towards evidence-based curricula with translational effects, using a variety of terms in PubMed for English-language articles up to October 2024. Results are presented as a structured narrative review.
Results
For virtual reality simulators, there is sound evidence for effective training outcomes. The required instruments for the development of minimally invasive surgery curricula that combine different simulation modalities to create a clinical benefit are known and published.
Conclusion
Surgeons are the main creators of minimally invasive surgery training curricula and often follow a hands-on oriented approach that leaves out the equally important aspects of assessment, evaluation, and feedback. Further high-quality research that incorporates the available evidence in this field promises to improve patient safety in surgical disciplines.
For a long time, surgical training was mainly conducted through exposure of the trainee to procedures in the operating theatre, following the Halstedian apprenticeship model of “see one, do one, teach one”. 1 Several developments in the surgical sciences rendered this model inadequate for resident training. For example, the advent and rise of laparoscopic surgery required a markedly different skill set: three-dimensional handling of structures guided by a two-dimensional video image, working with long instruments operated over a single pivot point (fulcrum effect), and advanced hand-eye coordination. As a result, the learning curve for laparoscopic surgery is longer than that for open surgery, 2 which means a higher risk of error and complications during the learning process. 3 With the rising importance of patient safety and evidence-based medicine, the concept of “learning by doing” was called into question. This situation came into even sharper focus with the increasing implementation of robotic surgery, which poses yet another set of challenges. 4 Furthermore, because of the working-time restrictions introduced at the beginning of the 21st century, the exposure needed for training can no longer be achieved within a reasonable period in the operating room (OR) alone. 5 When there is no substitute training model, residents are less experienced in surgical procedures, which reduces patient safety. 6 , 7 Last but not least, costs and expenses exert a growing influence on the healthcare system, rendering the expensive OR unfit for resident training because of the additional associated cost. 8 , 9
To counter this development, simulation-based surgical training has become increasingly popular both in research and in surgical curricular design. Basic skills can be acquired without endangering patients, there is less time pressure, and results can be quantified. Over time, simulators have evolved from simple box trainers and wet lab setups to the newest advancement, virtual reality simulators. In addition, web-based learning modules 10 , 11 and high-quality surgical videos 12 provide the necessary theoretical background. Together with traditional face-to-face teaching, this concept of “blended learning” is another advancement of modern surgical training. 13 , 14 Over the past 20 years, much research has been carried out on the value of simulation systems compared with each other and with performance in real surgical procedures, resulting in a heterogeneous mix of study designs and results.
This narrative review aims to explore how virtual reality fits into a comprehensive simulation training framework alongside other modalities such as box trainers, wet lab and blended learning. Moreover, it investigates how research can be designed and carried out to pave the way towards evidence-based training curricula. Finally, it aims to define the key concepts of translational research implementation with a view to examining the effect of simulation training on the real clinical environment. Three questions form the basis of these aims:
- What is the scientific groundwork for training?
- Which training instruments are available, and with what level of evidence?
- How can these training instruments be used to achieve a structured training curriculum with measurable effect on clinical outcome?
Accordingly, the basics and instruments for structured evaluation of simulator training are introduced, and the available simulator modalities are characterized. Building on this, current knowledge on constructing evidence-based curricula from these different modalities is presented, together with means to investigate the effect on clinical outcomes.
2 Methods
A broad literature search was conducted to summarize current knowledge about simulation training for minimally invasive surgery. A variety of terms was used in the PubMed database for English-language articles from 1990, when the first surgical simulators emerged, up to October 2024. Some older publications regarding the concept of validity were added during the search. The search terms included surgery, training, simulation, VR, box trainer, animal, laparoscopic simulation, robotic simulation, skill transfer, validity, blended learning, non-technical skills, training curriculum, cost effectiveness and combinations of these terms, including different spellings and punctuation. Results are presented in a structured narrative review consisting of three parts: assessment, training modalities, and curricula and translational research.
3 Assessment: Understanding the scientific concept behind training
3.1 Assessment of surgical skills: validity, metrics and proficiency
Assessment of skills is done by testing, and whether tests actually measure and discriminate different levels of skill is determined by their validity. The same concept can be applied to simulation training. Simulators are constructed to replicate certain aspects of reality. Whether a simulator successfully depicts reality (validity) is measured by five categories 15: Face validity is a scale for how well the simulator looks and feels like a real surgical scenario (e.g., haptic feedback of virtual reality simulation). Content validity describes whether the simulation outcome variables correspond to those of reality (e.g., time to complete a procedure). Another way to account for the value of simulators is to compare them against each other (concurrent validity). This is closely connected to the concept of transfer or predictive validity, which denotes the ability of simulator training to improve outcomes of the simulated scenario under real-life circumstances. The capability of a simulation to discriminate between different levels of expertise is measured by construct validity (e.g., is the outcome variable “time to complete a procedure” different for an intern vs. an attending surgeon). 16 , 17 While face and content validity are subjective categories, the others can be quantified by different tests and scores (see Table 1).
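To illustrate how construct validity can be tested quantitatively, the sketch below compares a hypothetical “time to complete” metric between a novice and an expert group using a non-parametric Mann-Whitney U test; the sample values are invented for illustration and do not stem from any cited study.

```python
# Sketch: quantitative test of construct validity for a simulator metric.
# A metric supports construct validity if it discriminates expertise levels;
# here we compare hypothetical "time to complete" values (in seconds).
from scipy.stats import mannwhitneyu

novice_times = [412, 388, 455, 397, 430, 441]  # hypothetical intern data
expert_times = [231, 205, 248, 219, 240, 226]  # hypothetical attending data

# Non-parametric test: suits small samples without a normality assumption.
stat, p_value = mannwhitneyu(novice_times, expert_times, alternative="two-sided")
print(f"U = {stat}, p = {p_value:.4f}")
# A significant difference (e.g., p < 0.05) would support construct
# validity of "time to complete" for this simulator task.
```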
These classic concepts of validity were unified by Messick 18 in 1989 into a construct-focused framework that defines validity as the extent to which score interpretations collected from any kind of assessment tool fit real-world constructs. In this framework, evidence for validity originates from five sources: content, response process, internal structure, relations with other variables, and consequences (see Table 2). 19–21 These categories are less abstract than the traditional dimensions of validity and represent the current standard in medical education testing. 22
The quantification of assessment usually involves a score composed of different variables and a rater with a higher level of expertise than the test subject. Scores to measure surgical performance in simulation and real life have been developed since the early 1990s, driven by the need for valid and reliable assessment of surgical training. 25 One of the first scores was the “Objective Structured Assessment of Technical Skill” (OSATS) by Martin et al. 26: it consists of seven categories, each rated from 1 to 5, and can be applied to many different procedures because it focuses on general principles of surgery rather than procedure-specific steps ( Table 3 ). This is an important factor for comparability. The OSATS score has proven feasibility, reliability and validity. 25 , 27 , 28
Specifically for laparoscopic surgery, the “Global Operative Assessment of Laparoscopic Skills” (GOALS) was developed in the same fashion in 2003 by Vassiliou et al. 29 and is now widely used in surgical training research ( Table 4 ). 30 It was modified for application in robotic surgery by Goh et al. 31 in 2012 by adding an item for “robotic control”, resulting in the Global Evaluative Assessment of Robotic Skills (GEARS). Both of these scores have been successfully tested for validity and reliability. 25 , 32 , 33
Because these scores capture unique robotic skills such as wrist movement and third/fourth-arm inclusion only to a limited extent, Liu et al. developed an alternative score for robotic skills that concentrates on individual skills rather than overall performance. 34
Virtual reality simulators have the advantage of automatically generating validated and reliable metrics such as economy of movement, length of path, and instrument errors. 17 , 35 These metrics correspond to the OSATS scoring of real-life procedures, 36 which in turn corresponds to real-life patient outcomes (reoperation and complication rates). 37
Lastly, assessment of simulator training should produce reliable results, meaning that it should give the same results when applied at two different time points at the same proficiency level (test-retest reliability). This is important in order to attribute changes in scoring to changing proficiency levels. Another important aspect of reliability is inter-rater consistency: scores should give the same result when applied to the same test subject by different raters. 17 Blinding of assessment can be achieved through later scoring of recorded video footage, which is also more convenient and therefore more feasible. The tool presented in the following section does not need experienced raters, so the aspect of inter-rater consistency does not apply.
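Inter-rater consistency can be quantified in several ways; a common choice for categorical item scores is Cohen’s kappa, implemented directly in the short sketch below (the two raters’ score lists are hypothetical).

```python
# Sketch: Cohen's kappa as a measure of inter-rater consistency for two
# raters scoring the same trainees on a 1-5 item (hypothetical data).
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    # Observed proportion of exact agreement between the raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters scored independently at random.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

rater_1 = [3, 4, 2, 5, 3, 4, 3, 2]
rater_2 = [3, 4, 3, 5, 3, 4, 2, 2]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")  # 1.0 = perfect agreement
```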
3.2 Automated assessment
Another tool made possible by the availability of these automatically generated metrics is automated assessment, possibly supported by artificial intelligence. While the aforementioned scores already provide objective assessment with good reliability, an experienced surgeon is needed to perform the rating. This is time-consuming and thereby cost-intensive, which limits its application, as time and personnel are among the most limited resources in surgical departments.
Metrics that have been used in automated assessment are total task time, kinematics (instrument movement), number of errors, critical errors and certain events unique to the corresponding task. 38 Using these metrics, several studies proved face, construct, content and concurrent validity for virtual robotic simulators, 39–43 as well as correlation with GEARS. 44 Strong correlations were reported between efficiency and time to complete, between economy of motion and depth perception, and for the overall score, whereas correlations were weak between bimanual dexterity and economy of motion and between robotic control and instrument collisions.
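As an illustration of how such kinematic metrics can be derived from raw instrument data, the sketch below computes total path length and an economy-of-motion ratio from a sampled 3D instrument-tip trajectory; the trajectory is invented, and because exact metric definitions vary between platforms, the ratio used here (straight-line distance over path length) is an assumption.

```python
# Sketch: deriving kinematic metrics from a sampled instrument-tip trajectory.
# Exact definitions vary between platforms; "economy of motion" is taken
# here as straight-line distance divided by the actual path length.
import numpy as np

def path_length(traj):
    """Total distance travelled along an (n, 3) array of tip positions."""
    return np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1))

def economy_of_motion(traj):
    """1.0 = perfectly straight movement; lower values = more wasted motion."""
    straight_line = np.linalg.norm(traj[-1] - traj[0])
    return straight_line / path_length(traj)

# Hypothetical trajectory sampled at fixed intervals (x, y, z in mm).
traj = np.array([[0, 0, 0], [5, 2, 1], [9, 1, 3], [14, 4, 2], [20, 5, 5]], float)
print(f"path length = {path_length(traj):.1f} mm")
print(f"economy of motion = {economy_of_motion(traj):.2f}")
```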
Metrics generated in real human robotic procedures were likewise able to differentiate between novice and expert surgeons. 45 , 46 In one study, automated performance metrics could even detect resident involvement in at least one of the main steps of robotic prostatectomy. 47 Notably, resident involvement did not affect the postoperative outcome.
The newest addition to automated assessment is artificial intelligence, mostly used for video-based evaluation (computer vision). This term mainly denotes the automated recognition and tracking of instruments, which is then translated into various kinematic metrics. 48 Such models have been successfully used in real laparoscopic and robotic surgery as well as on robotic simulators. 49–51 Machine learning has also been applied directly to the kinematic data available from the robotic platform to differentiate between novice and expert levels in suturing. 52 Furthermore, fine-grained data such as needle angulation appear to have more predictive value than scores for the whole suture.
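A minimal sketch of this kind of classification follows, assuming that kinematic summary features (task time, path length, instrument collisions) have already been extracted per trial; the data are synthetic placeholders and the choice of a random-forest classifier is illustrative, not the method of the cited studies.

```python
# Sketch: classifying novice vs. expert trials from kinematic features.
# Features and labels are synthetic placeholders, not published data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Feature columns: task time (s), path length (mm), instrument collisions.
novices = rng.normal([420, 900, 6], [60, 150, 2], size=(40, 3))
experts = rng.normal([230, 500, 1], [40, 100, 1], size=(40, 3))
X = np.vstack([novices, experts])
y = np.array([0] * 40 + [1] * 40)  # 0 = novice, 1 = expert

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
print(f"mean accuracy = {scores.mean():.2f}")
```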
3.3 Concept of proficiency
Another question that must be answered is which score corresponds to proficiency for a certain task or procedure. This is often solved with benchmarks: the mean score of high-level surgeons is defined as the standard for proficiency. 53 “Training to proficiency” or “proficiency-based progression” 53 usually signifies that a trainee completes two task repetitions in a row at the benchmark, thereby reaching a plateau of the learning curve. 54
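The progression rule described above can be expressed compactly in code; in this sketch (benchmark value and score history hypothetical), a trainee is considered proficient once two consecutive repetitions reach the expert-derived benchmark.

```python
# Sketch: proficiency-based progression -- a trainee passes once two
# consecutive repetitions meet the expert-derived benchmark.
def reached_proficiency(scores, benchmark, streak=2):
    """Return the 1-based repetition at which proficiency was reached, else None."""
    run = 0
    for repetition, score in enumerate(scores, start=1):
        run = run + 1 if score >= benchmark else 0
        if run == streak:
            return repetition
    return None

history = [14, 17, 19, 21, 20, 22, 23]  # hypothetical GOALS-like scores
print(reached_proficiency(history, benchmark=21))  # -> 7 (repetitions 6 and 7)
```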
3.4 Blended learning and non-technical skills
For the acquisition of psychomotor skills, background knowledge of anatomy, physiology and pathology is equally relevant. Traditional teaching models are described by the term “face-to-face learning”: a teacher lectures students with the help of image and/or text material, most often in the form of presentation slides. In contrast, the concept of “E-learning” comprises multimedia content (text, images, video, games) that is delivered via the internet and can be accessed flexibly. 55 Common services are webop.de, 11 websurg.com, 56 and the Toronto Video Atlas (tvasurg.ca), 57 which provide sectioned videos with written/spoken commentary for a variety of surgical procedures. Serious gaming, mainly available for critical care team skills, 58 is another application of E-learning. These concepts have been proven to exert a significant effect on practical surgical performance. 13 , 14 For the combination of these platforms with traditional methods of cognitive learning, the term “blended learning” was coined.
As described above, assessment and feedback are important cornerstones of effective skill acquisition. Both for this purpose and when working on a patient with an OR team, communication skills are key to professional practice. Several frameworks have been developed to train these skills specifically, originally derived from aviation, where communication errors can likewise lead to human fatalities. For example, the Non-Technical Skills for Surgeons (NOTSS) framework was developed in 2006 by a team of psychologists on the basis of interviews, cognitive task analyses and observations with consultant surgeons. 59 Its key categories (situation awareness, decision making, task management, leadership, and communication/teamwork) can be scored similarly to the technical scores mentioned above. It is important to remember that self-evaluation of these skills to validate training is not feasible. 60 Another example is the TeamSTEPPS™ framework, which provides a step-by-step approach to improving non-technical skills in a surgical workplace. 61 These skills can be effectively trained in team simulation training. 62
4 Training modalities: what is available and what is it worth?
4.1 Wet lab/Animal models
One of the oldest models for the training of surgical skills and procedures is the use of human cadavers, cadaveric or live (anesthetized) animals, and isolated organs/anatomical structures. For laparoscopic training, small animals like rabbits as well as large animals such as pigs, sheep and dogs have been used. 63 Naturally, these models, especially larger anesthetized mammals, provide the most realistic experience (face validity). 64 Most minimally invasive general and visceral surgery procedures are conducted in a porcine model. This model was reported to improve several parameters including total procedure time, radicality of lymph node dissection (in gastrectomy) and overall performance scores, 65 , 66 although no study showed transferability of the acquired skills to human interventions. Cadaveric (fresh frozen) humans/animals or (perfused) single organs like small intestine can be used for the training of certain tasks, such as the creation of an intestinal anastomosis. These models are known for high face validity. 67–69 Cadaveric human or animal models are often used for the assessment of concurrent validity, especially as a substitute for testing transfer to real OR situations (transfer validity). 70 Using a chicken skin model, Nadu et al. were able to show transferability of skill for the time to complete a urethrovesical anastomosis in humans, but the assessment of other parameters was very limited. 71 Using more advanced evaluation, Patel et al. observed an increase in OSATS scores in laparoscopic salpingectomy after training on a porcine cadaver. 72
Disadvantages of this concept include ethical considerations regarding animal experiments as well as high costs, especially when using live mammals. 73
4.2 Box trainers (video-based)
Box trainers were among the first dedicated laparoscopic training devices. They typically consist of a table with ports for laparoscopic instruments and a camera that projects the image onto a TV screen. On a platform fixed under the table, trainees perform exercises ranging from object transfer to more advanced tasks. A wide variety of trainers has been reported, both self-built and commercial. 74 , 75 Advantages of box trainers are their relatively low cost and flexible accessibility, whereas the option to perform whole procedures is usually limited. In a systematic review comparing laparoscopic performance of novice surgical trainees after box training versus no training, no difference was reported between different kinds of box simulators, but training led to a significant improvement of selected laparoscopic skills, 75 seen in the dimensions of time to complete tasks, accuracy and frequency of errors.
Transferability of the skills acquired and improved through box trainers has been shown for a variety of applications. 76 Zendejas et al. 77 demonstrated significantly better GOALS scores and operative times after training with a box simulator for total extraperitoneal hernia repair; the same was shown for laparoscopic cholecystectomy by Bansal et al. 78
For robotic surgery, similar concepts are available. The main difference is the need for a functioning robotic system, which means either using an available system in the OR after working hours or allocating a separate device (cost-intensive). In addition to conventional practice modes (object transfer, cloth cutting, etc.), more complex tasks like pancreatojejunal anastomosis can be trained on 3D-printed artificial tissue that replicates the texture and haptics of its real-world counterpart. Face validity of this concept has been proven 79 and two studies reported implementation of this training mode into their robotic curricula. 80 , 81
4.3 Virtual reality simulation
The first virtual reality simulators in surgery were developed for the laparoscopic approach in the early 1990s. 82 , 83 Since then, they have developed into highly capable machines that not only simulate an accurate and functional anatomical environment but also include haptic feedback and the ability to perform whole procedures. 84 A further advantage is the option to measure additional parameters like instrument pathway length and camera movement, as well as to give trainees instant performance feedback. Similar platforms are available for robotic surgery simulation. 85
4.3.1 Laparoscopic
Compared with the traditional apprenticeship, virtual reality laparoscopic training improves task time, path length, instrument handling, tissue handling, error scores and OSATS score. 84 This mode of training is especially effective in novices with little to no prior experience. 86 Compared with video box trainers, the majority of studies do not show a significant advantage for VR. 87–89 Regarding transfer of skills to the OR, virtual reality training improved OSATS and GOALS scores and time to complete, but no difference was found in comparison with box trainers. 30 This raises the question of cost effectiveness, but no study has addressed this so far.
4.3.2 Robotic
Virtual simulators for robotic surgery are based on the same principles as laparoscopic simulators, with the ability to perform both specific skill tasks and whole procedures. Two systematic reviews with meta-analyses summarize the current evidence on virtual robotic training. 85 , 90 While face validity, content validity and transfer of skills in the dimensions of time and technical performance (measured via GOALS and GEARS) were confirmed, the number of included studies and their participant numbers were low.
5 Curricula and translational research: making structured use of the instruments
5.1 Comprehensive training framework
Each mode of simulation for training of minimally invasive surgery has its characteristic advantages and disadvantages ( Table 5 ). The high face validity of wet lab concepts comes at the price of ethical considerations, financial costs and limited availability. 73 Given the low evidence regarding improvement and transfer of surgical skills, this modality finds its application more in the experimental evaluation of new techniques and in expert training than in basic surgical training. 91 Because box trainers lacked the option of performing more complex tasks (except with artificial biotissue 79), the use of animals for end-stage training of procedures was justified in the past. Today, with the availability of virtual reality platforms and artificial biotissue models, this justification is becoming less valid. Simulators offer validated performance improvement that is transferable to real-life surgery, along with direct, quantified performance feedback. 30 , 84–90 The place of box trainers in this environment would be in the very early stages of resident and student training. They are widely available, inexpensive and validated for the acquisition of basic skills in camera movement, instrument handling, and basic tasks like object transfer. 74 , 75
Several studies have shown that multimodality training curricula are superior to single-modality curricula. 92 , 93 It is important to subdivide complex procedures into single tasks that lead from basic to advanced skills. In the case of pancreaticoduodenectomy, a validated concept is to start with basic robotic skills on a virtual reality simulator 94 , 95 and then proceed to training the complex tasks, like hepatojejunostomy and pancreaticojejunostomy, on artificial biotissue models 81 before supervised practice on patients. 96
This current knowledge must be translated into structured curricula for each surgical discipline and in even more detail for certain procedures.
5.2 Ideal curriculum design
In 1998, Kern et al. 97 described a six-step structured approach for the development of health care education curricula. This was modified by Khamis et al. 54 in 2015 for application to simulation-based surgical training curricula, resulting in seven steps (see Table 6 ). The first step consists of identifying the problem and the general needs, such as the learning curves and complication rates of the procedures addressed by the curriculum. This is followed by assessment of the situation at the institutional level, meaning the current state of training, the availability of simulation equipment and attending expertise. The third step consists of setting outcome goals and defining how to measure them. It is important to use standardized measures such as the OSATS score 26 or other numeric data that are easy to obtain in virtual reality simulators or by observation, to ensure comparable internal assessment and external validation of the curriculum (steps five and six). The content of the curriculum is decided in step four and involves splitting the procedure to be trained into single tasks and steps, assigning them to different experience levels, setting benchmarks and proficiency levels, and including different simulation modes, as mentioned above. Lastly, the curriculum has to be implemented (step seven), which includes practical aspects such as financial and institutional support. This process is constantly under review and relies on continuous feedback, resulting in a refined curriculum that should be published for external evaluation. 54 , 97 , 98
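For institutions building tooling around such a curriculum, step four can be captured as plain data; the sketch below models a modular progression for the robotic pancreaticojejunostomy example of Table 6 (all module names, modalities and thresholds are hypothetical placeholders, not a published curriculum).

```python
# Sketch: a modular curriculum (step four of the Kern/Khamis framework)
# represented as data. All names and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class Module:
    name: str
    modality: str    # e.g. "VR simulator", "biotissue model"
    assessment: str  # validated score used for certification
    benchmark: float # expert-derived proficiency threshold

CURRICULUM = [
    Module("Basic robotic skills", "VR simulator", "simulator metrics", 0.8),
    Module("Robotic suturing drills", "VR simulator", "simulator metrics", 0.8),
    Module("Pancreaticojejunostomy", "biotissue model", "OSATS", 25),
    Module("Supervised steps on patients", "OR (proctored)", "GEARS", 18),
]

def next_module(certified):
    """Trainees progress strictly in order, one certified module at a time."""
    for module in CURRICULUM:
        if module.name not in certified:
            return module
    return None

print(next_module({"Basic robotic skills"}).name)  # -> Robotic suturing drills
```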
5.3 Current curricula
One of the first curricula for simulation training in laparoscopic surgery is the “Fundamentals of Laparoscopic Surgery” (FLS), initially developed by the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) to teach the basics of laparoscopic surgery. 99 It covered not only cognitive knowledge but also technical skills: “peg transfer, pattern cutting, ligating loop, suturing with an intracorporeal knot, and suturing with an extracorporeal knot”, 99 , 100 which are now famous in the world of laparoscopic surgical training. This curriculum serves as a very good example, as it was continuously assessed and validated over the following years, 101 showing feasibility of practical implementation (96 % participation rate among residents over a 2-year period), 102 construct validity and transfer of skills to the operating room, 103 with significant preservation of skills after a span of two years. 102 , 104 This led to the FLS curriculum becoming a requirement for certification in laparoscopic surgery by the American Board of Surgery. 105 Another institution that provides resources and accreditation for curriculum development is the American College of Surgeons Accredited Education Institutes (ACS-AEI). 106 However, the most used and most readily applicable curriculum is the virtual reality course by Aggarwal et al. 107 for laparoscopic cholecystectomy. The curriculum is divided into nine basic tasks, four procedural tasks and a full-procedure task, all of them carried out on the LAP Mentor VR laparoscopic surgical simulator (Simbionix Corporation, Cleveland, Ohio, USA). To advance to the next level of tasks, the trainee has to achieve a defined grade of proficiency. This benchmark was defined by the median performance of experienced surgeons. Construct validity was proven via the difference in performance between novice and senior surgeons, and learning curve analysis for the full procedure showed plateaus after the second to third repetition. A similarly constructed curriculum with wide acceptance was designed by Sinitsky et al. 108 for laparoscopic appendectomy in 2019, also using the LAP Mentor VR trainer. The training consists of two basic tasks, five procedural tasks and the full procedure. Benchmark levels to proceed to the next task were defined by the median performance of experienced surgeons. Learning curves reached plateaus after about six repetitions, depending on the outcome measure. Another similar curriculum is available on the same VR platform for laparoscopic sigmoidectomy. 109
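The plateau analyses mentioned for these curricula can be operationalized in different ways; one simple sketch follows, under the assumption that a plateau is declared once the relative improvement between consecutive repetitions stays below a tolerance (tolerance and data are hypothetical).

```python
# Sketch: learning-curve plateau detection. A plateau is declared when
# relative improvement stays below `tol` for `window` consecutive steps.
def plateau_at(times, tol=0.05, window=2):
    run = 0
    for i in range(1, len(times)):
        improvement = (times[i - 1] - times[i]) / times[i - 1]
        run = run + 1 if improvement < tol else 0
        if run == window:
            return i + 1 - window  # 1-based repetition where the plateau began
    return None

# Hypothetical task times (seconds) over repetitions of a full VR procedure.
times = [1450, 1100, 910, 880, 865, 860]
print(plateau_at(times))  # -> 3 (plateau from about the third repetition)
```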
Regarding robotic surgery, a variety of curricula has been published, most of them using virtual reality simulators as the main content. 110 , 111 A current systematic review by Rahimi et al. 110 lists the available curricula and addresses how they comply with the steps of curriculum design by Khamis and Kern et al. 54 , 97 (“problem/need identification, goals/objective, educational strategies, assessment/evaluation, implementation”). While all of the reported training schedules (n = 71) cover all of the mentioned steps, evaluation and assessment have been addressed more superficially than educational content. As the question of “what and how to teach and train” is more obvious than the more theoretical part of “how to assess and evaluate”, these results are not surprising. On the other hand, the principle that “assessment drives learning” is well accepted in medical education and should be addressed accordingly in surgical training curricula.
5.4 Translational research
The main goal of surgical training is to maximize patient safety and outcome. This means that the aforementioned training modalities and curricula must translate into measurable clinical benefits. Successful transfer of skills acquired during simulator training has been reported for laparoscopic 30 , 112–115 and robotic 85 , 116–119 virtual reality simulation. This evidence is promising, but it has to be further strengthened and expanded, as the studies have some deficiencies: small numbers and low expertise of test subjects, a limited range of procedures (mainly cholecystectomy), no multi-center comparisons, 120 and missing patient-centered outcome parameters such as blood loss, complication rate, and length of stay.
Gallagher et al. 120 addressed this by adopting a professional approach from other high-skill sectors such as aviation. Transfer of training (TOT) measures the difference in task performance between two groups, of which only one received training on the simulator in question. The transfer effectiveness ratio (TER) is the training time saved under real-world conditions (on the patient) relative to the time spent training on a simulator. Using this method, Gallagher et al. showed a TER of 24 % for laparoscopic novices on a simple transfer-and-place task. For cholecystectomy, this would correspond to “skipping” the part of the learning curve in which the risk of bile duct injury falls from 2 % to 0.5 %, a reduction that in “hands-on” patient training is only achieved over the first 20 cases. 3 , 120
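Following the definitions above, the TER can be written as a simple quotient; the worked numbers below are illustrative only, using the reported ratio of 24 %.

```latex
\[
\mathrm{TER} \;=\; \frac{T_{\text{control}} - T_{\text{trained}}}{T_{\text{sim}}}
\]
```

Here, $T_{\text{control}}$ and $T_{\text{trained}}$ denote the real-world (OR) training time needed without and with simulator training, and $T_{\text{sim}}$ is the time spent on the simulator. With $\mathrm{TER} = 0.24$, a hypothetical 10 h of simulator practice would replace about $0.24 \times 10 = 2.4$ h of training time on the patient.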
Another factor that has yet to be addressed is cost effectiveness. As the implementation of curricula depends on financial support from supervising institutions (clinic boards, health care insurance providers, government institutions), it is important to back the need for simulation training with financial advantages. The duration of procedures may be a poor indicator of surgical quality, 121 but it is an important predictor of procedure costs, as personnel expenses are the biggest contributor. 122 The same is true for length of stay. While the dimension of time can be quantified easily, other cost factors, such as complication rates and their management, are more difficult to express in financial terms. Regardless, the first two measurable cost dimensions should be compared with the initial and ongoing costs of simulators, material and personnel (supervising and evaluating trainees). Finally, the implementation of well-validated training curricula carries high publication value and improves workforce satisfaction.
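As a rough illustration of such a comparison, the sketch below estimates a break-even point for a simulator purchase from the OR time it saves; every figure is a hypothetical placeholder, not published data.

```python
# Sketch: break-even estimate for a simulator purchase.
# ALL figures below are hypothetical placeholders, not published data.
OR_COST_PER_HOUR = 1500.0      # assumed total OR cost (EUR/h)
TER = 0.24                     # transfer effectiveness ratio (see above)
SIM_HOURS_PER_RESIDENT = 20.0  # assumed curriculum training volume
SIMULATOR_COST = 120_000.0     # assumed purchase + maintenance (EUR)

or_hours_saved = TER * SIM_HOURS_PER_RESIDENT            # per resident
savings_per_resident = or_hours_saved * OR_COST_PER_HOUR
break_even = SIMULATOR_COST / savings_per_resident
print(f"{savings_per_resident:.0f} EUR saved per resident; "
      f"break-even after about {break_even:.0f} residents trained")
```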
Patient-related outcomes have only been reported for a minority of laparoscopic 123 , 124 and robotic 116 , 118 procedure trainings. Since global rating scores may not translate directly into patient outcomes for every procedure, these patient-related parameters are the most relevant ones, and they can also be quantified (blood loss, length of stay, incidence of procedure-specific complications). Furthermore, these outcomes can be easier to record, as structured rating requires experienced auditors, a limited resource in clinical practice.
To provide sufficient evidence for clinical benefits, these study setups must be conducted with standardized measurements, including patient-centered outcomes, and must be carried out over different time spans. Groups have to be randomized and readouts need to be performed in a blinded fashion to keep the risk of bias to a minimum. 75 , 89 Multi-center studies are then the last step for proving clinical benefits of simulation training in surgery.
5.5 Take home messages
Surgical training in the OR alone is unstructured, conflicts with patient safety and is not cost-efficient in the current health system. Current training research provides several alternatives. One of the key points of training progress evaluation and research is assessment, which must satisfy several dimensions of validity. Validated tools for assessment exist, namely scores (OSATS, GOALS, GEARS), which require experienced raters, and automated assessment on the basis of kinematic data, which can be supported by artificial intelligence.
Training is already, and will increasingly be, focused on simulated setups: dry lab, wet lab and virtual reality simulators, supported by traditional sources of knowledge, videos and non-technical skills (blended learning). 1) Animals, cadavers and single organs offer high face validity, while their limited availability makes repeated training difficult. 2) Box trainers for both laparoscopic and robotic surgery offer good evidence for skill improvement and transferability; they are flexible and accessible, and their use can be extended with artificial biotissue. 3) Virtual reality simulators offer similar evidence regarding skill acquisition and transfer validity, with the added advantage of direct feedback through automated assessment. High initial costs are usually balanced by the absence of consumables.
These tools need to be implemented in a curriculum that stems from a structured assessment of target needs and training objectives. Educational strategies and assessment tools should be well defined. The results of the training and its effect on patient outcomes should be continuously evaluated, and the curriculum must be adjusted accordingly.
Regarding the crucial translation of improved skills into improved patient outcomes, knowledge is still lacking.
6 Conclusion
During the rise of minimally invasive surgery, a wide range of simulation modalities (wet lab, dry lab, virtual reality) have been developed to train residents before skill acquisition on human patients. Each of these modalities has its respective benefits and drawbacks. The required instruments for the development of minimally invasive surgery training curricula that combine these different modalities to create a clinical benefit are known and published. The main problem of most current and past research in this field is the absence of structure and standardization, as well as the failure to use the available instruments. Further high-quality research in this field promises to improve patient safety in surgical disciplines.
CRediT authorship contribution statement
Philipp Seeger: Writing – original draft, Visualization, Data curation. Nikolaos Kaldis: Writing – review & editing, Validation. Felix Nickel: Writing – review & editing, Validation. Thilo Hackert: Writing – review & editing, Validation. Panagis M. Lykoudis: Writing – review & editing, Validation, Conceptualization. Anastasios D. Giannou: Writing – review & editing, Validation, Supervision, Funding acquisition, Conceptualization.
Declaration of interest
Philipp Seeger: none.
Nikolaos Kaldis: none.
Felix Nickel: none.
Thilo Hackert: none.
Panagis M. Lykoudis: teaching fees from Johnson & Johnson; travel/workshop costs from Medtronic.
Anastasios D. Giannou: none.
Table 1. Classic categories of simulator validity.

| Validity | Definition | Example |
| Face validity | How well a simulator looks and feels like a real surgical scenario | Haptic feedback of virtual reality simulation |
| Content validity | Whether the simulation outcome variables fit those of reality | Time to complete a procedure can be measured by virtual reality simulation |
| Concurrent validity | Whether skills acquired in one mode of simulation can be measured with another model | Laparoscopic training with box simulators improves skill levels in virtual reality simulation |
| Transfer (predictive) validity | Whether skill improvement or acquisition in simulation training can be transferred to a real-life scenario | Virtual reality camera movement training improves camera movement in real laparoscopic surgery |
| Construct validity | How well a simulator can differentiate between different levels of expertise | Experienced surgeons achieve better performance results than residents |
Table 2. Sources of validity evidence in Messick's framework.

| Source of validity evidence | Definition | Example for surgical training |
| Content | The content of assessment is similar to its real-life counterpart | A virtual reality simulation of an appendectomy resembles the real procedure (anatomy, function of tissue, haptics etc.) |
| Response process | Dimension of quality control for raters | Structured scoring for a laparoscopic task is carried out in a blinded fashion |
| Internal structure | Suitability of single items to measure the whole construct | The single basic laparoscopic skills acquired in a box trainer all together improve procedure scores |
| Relations with other variables | Correlation (negative or positive) of connected variables | Improvement in laparoscopic skills does or does not translate to robotic skills |
| Consequences | The results of assessment lead to a defined action | Proficiency levels acquired in virtual reality training as requirement for real-life robotic surgery |
Table 3. Objective Structured Assessment of Technical Skill (OSATS); each category is rated from 1 to 5, with descriptive anchors at 1, 3 and 5.

| Category | 1 | 2 | 3 | 4 | 5 |
| Respect for tissue | Frequently used unnecessary force on tissue or caused damage by inappropriate use of instruments | | Careful handling of tissue but occasionally caused inadvertent damage | | Consistently handled tissues appropriately with minimal damage |
| Time and motion | Many unnecessary moves | | Efficient time/motion but some unnecessary moves | | Economy of movement and maximum efficiency |
| Instrument handling | Repeatedly makes tentative or awkward moves with instruments | | Competent use of instruments although occasionally appeared stiff or awkward | | Fluid moves with instruments and no awkwardness |
| Knowledge of instruments | Frequently asked for the wrong instrument or used an inappropriate instrument | | Knew the names of most instruments and used the appropriate instrument for the task | | Obviously familiar with the instruments required and their names |
| Use of assistants | Consistently placed assistants poorly or failed to use assistants | | Good use of assistants most of the time | | Strategically used assistants to the best advantage at all times |
| Flow of operation and forward planning | Frequently stopped operating or needed to discuss next move | | Demonstrated ability for forward planning with steady progression of operative procedure | | Obviously planned course of operation with effortless flow from one move to the next |
| Knowledge of specific procedure | Deficient knowledge. Needed specific instruction at most operative steps | | Knew all important aspects of the operation | | Demonstrated familiarity with all aspects of the operation |
Table 4. Global Operative Assessment of Laparoscopic Skills (GOALS); each domain is rated from 1 to 5, with descriptive anchors at 1, 3 and 5. The item “robotic control” was added in the modification for robotic surgery by Goh et al.

| Domain | 1 | 2 | 3 | 4 | 5 |
| Depth perception | Constantly overshoots target, wide swings, slow to correct | | Some overshooting or missing of target, but quick to correct | | Accurately directs instruments in the correct plane to target |
| Bimanual dexterity | Uses only one hand, ignores nondominant hand, poor coordination between hands | | Uses both hands, but does not optimize interaction between hands | | Expertly uses both hands in a complementary manner to provide optimal exposure |
| Efficiency | Uncertain, inefficient efforts; many tentative movements; constantly changing focus or persisting without progress | | Slow, but planned movements are reasonably organized | | Confident, efficient and safe conduct, maintains focus on task until it is better performed by way of an alternative approach |
| Tissue handling | Rough movements, tears tissue, injures adjacent structures, poor grasper control, grasper frequently slips, frequent suture breakage | | Handles tissues reasonably well, minor trauma to adjacent tissue (i.e., occasional unnecessary bleeding or slipping of the grasper), rare suture breakage | | Handles tissues well, applies appropriate traction, negligible injury to adjacent structures, no suture breakage |
| Autonomy | Unable to complete entire task, even with verbal guidance | | Able to complete task safely with moderate guidance | | Able to complete task independently without prompting |
| Robotic control | Consistently does not optimize view or hand position, or has repeated collisions even with guidance | | View is sometimes not optimal. Occasionally needs to relocate arms. Occasional collisions and obstruction of the assistant. | | Controls camera and hand position optimally and independently. Minimal collisions or obstruction of assistant. |
Table 5. Characteristic advantages and disadvantages of the simulation modalities.

| Mode of simulation | Evidence | Advantage | Disadvantage |
| Wet lab/animal models | High face validity; little evidence for transfer of skills to human interventions | Most realistic experience; suited to expert training and experimental evaluation of new techniques | Ethical considerations, high costs, limited availability |
| Box trainer | Validated improvement and transfer of basic laparoscopic skills | Inexpensive, flexible, widely accessible; extendable with artificial biotissue | Limited option to perform whole procedures |
| Virtual reality simulator | Validated skill acquisition and transfer for laparoscopic and robotic surgery | Automated metrics with direct, quantified performance feedback; whole procedures possible | High initial costs (balanced by the absence of consumables) |
Table 6. Seven steps of simulation-based curriculum development (after Kern et al. and Khamis et al.), with examples for robotic pancreaticojejunostomy.

| Step | Definition | Example |
| 1 | Problem identification and general need assessment | Critical step of pancreaticojejunostomy (PJ) in robotic pancreaticoduodenectomy (RPD) cannot be trained on patients as the early learning curve increases the rate of pancreatic fistula (PF) |
| 2 | Targeted need assessment | Number of RPD in institution, number of surgeons proficient in RPD, number of residents fitting for RPD training, availability of simulators |
| 3 | Goals and objectives | Scores for measurement of outcomes, what aspects of PJ should be trained (psychomotor, team communication, theoretical knowledge) |
| 4 | Educational strategies | Division of procedure into smaller tasks and basic skills, which simulators are validated for PJ training, benchmarking proficiency, modes of training, duration of training, assigning experienced reviewers |
| 5 | Individual assessment and feedback | Use validated assessment measures to certify proficiency |
| 6 | Program evaluation | Assess procedural performance and patient-centered outcomes, include individual feedback of trainees and raters |
| 7 | Implementation | Gather financial and institutional backing, provide arguments (patient safety, cost effectiveness), designate program administrators, introduction to trainees |