Abstract
This study assessed how different prompt engineering techniques, specifically direct prompts, Chain of Thought (CoT), and a modified CoT approach, influence the ability of GPT-3.5 to answer clinical and calculation-based medical questions styled after the USMLE Step 1 exam. To that end, we analyzed GPT-3.5's responses to two distinct question sets: 1,000 questions generated by GPT-4 and 95 real USMLE Step 1 questions. These questions spanned a range of medical calculations and clinical scenarios across various fields and difficulty levels. Our analysis revealed no significant differences in the accuracy of GPT-3.5's responses across direct prompts, CoT, and modified CoT. In the USMLE sample, for instance, success rates were 61.7% for direct prompts, 62.8% for CoT, and 57.4% for modified CoT (p = 0.734). Similar trends were observed for the GPT-4-generated questions, both clinical and calculation-based, with p-values above 0.05 indicating no significant difference between prompt types. We conclude that CoT prompt engineering does not significantly alter GPT-3.5's effectiveness on medical calculation or clinical scenario questions styled after the USMLE. This finding suggests that the performance of ChatGPT remains consistent whether CoT or direct prompting is used. Such consistency could simplify the integration of AI tools like ChatGPT into medical education, enabling healthcare professionals to use these tools without the need for complex prompt engineering.
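The abstract does not reproduce the exact prompt wording used in the study, so the following is a minimal sketch of how the three prompting conditions might be constructed and sent to GPT-3.5 through the OpenAI chat API. The prompt templates, the `ask` helper, and the model name are illustrative assumptions, not the authors' actual protocol.

```python
# Minimal sketch of the three prompting conditions (illustrative templates,
# not the authors' exact wording). Requires the `openai` package and an
# OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

DIRECT = "{question}\nAnswer with the single best option."
COT = (
    "{question}\n"
    "Let's think step by step, then state the single best option."
)
MODIFIED_COT = (
    "{question}\n"
    "First list the relevant clinical facts, then reason step by step, "
    "and finally state the single best option."
)

def ask(question: str, template: str, model: str = "gpt-3.5-turbo") -> str:
    """Send one USMLE-style question under a given prompting condition."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": template.format(question=question)}],
        temperature=0,  # keep answers near-deterministic for scoring
    )
    return response.choices[0].message.content

# Example: the same question under all three conditions.
# for name, tpl in [("direct", DIRECT), ("cot", COT), ("modified", MODIFIED_COT)]:
#     print(name, ask("A 24-year-old man presents with ...", tpl))
```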
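The reported p-value of 0.734 for the USMLE sample is consistent with a chi-square test of independence on the correct/incorrect counts per prompting condition. As a rough check, the counts below are inferred by assuming 94 scored questions, which matches the reported percentages (58/94 ≈ 61.7%, 59/94 ≈ 62.8%, 54/94 ≈ 57.4%); they are an assumption, not data taken from the paper.

```python
# Rough reproduction of the USMLE-sample comparison as a chi-square test of
# independence. Counts are inferred from the reported percentages assuming
# 94 scored questions; they are NOT taken from the paper.
from scipy.stats import chi2_contingency

#                 correct  incorrect
table = [
    [58, 36],  # direct prompts   (58/94 ≈ 61.7%)
    [59, 35],  # Chain of Thought (59/94 ≈ 62.8%)
    [54, 40],  # modified CoT     (54/94 ≈ 57.4%)
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.3f}")
# p comes out near 0.73, i.e., no significant difference between conditions,
# in line with the reported p = 0.734.
```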
Details
1 Mount Sinai Health System, New York, USA (GRID:grid.425214.4) (ISNI:0000 0000 9963 6690)
2 Affiliated to Tel-Aviv University, Hospital Management, Sheba Medical Center, Tel Aviv, Israel (GRID:grid.12136.37) (ISNI:0000 0004 1937 0546); Affiliated to Tel-Aviv University, ARC Innovation Center, Sheba Medical Center, Tel Aviv, Israel (GRID:grid.12136.37) (ISNI:0000 0004 1937 0546)
3 The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, USA (GRID:grid.59734.3c) (ISNI:0000 0001 0670 2351)
4 University of California, Los Angeles, USA (GRID:grid.19006.3e) (ISNI:0000 0000 9632 6718)
5 Affiliated to Tel-Aviv University, ARC Innovation Center, Sheba Medical Center, Tel Aviv, Israel (GRID:grid.12136.37) (ISNI:0000 0004 1937 0546); The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, USA (GRID:grid.59734.3c) (ISNI:0000 0001 0670 2351)