Abstract

Background

Multiple-choice questions are heavily used in medical education assessments, but they test recognition rather than knowledge recall. Grading open-ended questions, by contrast, is a time-intensive task for teachers. Automatic short answer grading (ASAG) has tried to fill this gap, and with the recent advent of large language models (LLMs), this branch has gained new momentum.

Methods

We graded 2288 student answers from 12 undergraduate medical education courses in 3 languages using GPT-4 and Gemini 1.0 Pro.
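The study does not reproduce its grading prompt, but the general workflow — send the question, answer key, and student answer to an LLM, then extract a numeric grade from the reply — can be sketched as below. All prompt wording, function names, and the `Grade: <number>` reply format are illustrative assumptions, not the authors' actual protocol; the model call itself is mocked.

```python
import re


def build_grading_prompt(question: str, answer_key: str,
                         student_answer: str, max_points: float) -> str:
    """Assemble a zero-shot grading prompt for an LLM such as GPT-4 or Gemini.

    The exact prompt used in the study is not published here; this wording
    is a hypothetical example only.
    """
    return (
        "You are grading a short answer from a medical education exam.\n"
        f"Question: {question}\n"
        f"Reference answer (key): {answer_key}\n"
        f"Student answer: {student_answer}\n"
        f"Assign a grade between 0 and {max_points} points. "
        "Reply with a single line of the form 'Grade: <number>'."
    )


def parse_grade(llm_reply: str, max_points: float) -> float:
    """Extract the numeric grade from the model reply, clamped to the valid range."""
    match = re.search(r"Grade:\s*([0-9]+(?:\.[0-9]+)?)", llm_reply)
    if match is None:
        raise ValueError(f"No grade found in reply: {llm_reply!r}")
    return min(max(float(match.group(1)), 0.0), max_points)


# Example with a mocked model reply (no API call is made here):
prompt = build_grading_prompt(
    question="Name the main neurotransmitter at the neuromuscular junction.",
    answer_key="Acetylcholine",
    student_answer="ACh (acetylcholine)",
    max_points=1.0,
)
mock_reply = "Grade: 1.0"
print(parse_grade(mock_reply, max_points=1.0))  # → 1.0
```

In a real pipeline, `mock_reply` would come from an API call to the chosen model; note that the Conclusions suggest the answer key should be withheld from nothing but the human grade, i.e. the prompt must not reveal the teacher's score to the model.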

Results

GPT-4 proposed significantly lower grades than the human evaluator but reached low rates of false positives. The grades of Gemini 1.0 Pro did not differ significantly from the teachers'. Both LLMs reached moderate agreement with human grades, and GPT-4 achieved high precision on answers considered fully correct. A consistent grading behavior could be determined for high-quality answer keys. Only a weak correlation was found with respect to the length or language of student answers. There is a risk of bias if the LLM knows the human grade a priori.

Conclusions

LLM-based ASAG applied to medical education still requires human oversight, but time can be spared on the edge cases, allowing teachers to focus on the middle ones. For Bachelor-level medical education questions, the training knowledge of LLMs seems to be sufficient; fine-tuning is thus not necessary.

Details

Title: LLM-based automatic short answer grading in undergraduate medical education
Volume: 24
Pages: 1-16
Publication year: 2024
Publication date: 2024
Section: Research
Publisher: Springer Nature B.V.
Place of publication: London
Country of publication: Netherlands
e-ISSN: 1472-6920
Source type: Scholarly Journal
Language of publication: English
Document type: Journal Article
Online publication date: 2024-09-27
Milestone dates: 2024-05-19 (Received); 2024-09-13 (Accepted); 2024-09-27 (Published)
First posting date: 27 Sep 2024
ProQuest document ID: 3115122456
Document URL: https://www.proquest.com/scholarly-journals/llm-based-automatic-short-answer-grading/docview/3115122456/se-2?accountid=208611
Copyright: © 2024. This work is licensed under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated: 2024-12-16
Database: ProQuest One Academic