Abstract
This study aims to develop a robust rubric for evaluating artificial intelligence (AI)-assisted essay writing in English as a Foreign Language (EFL) contexts. Employing a modified Delphi technique, we conducted a comprehensive literature review and administered Likert-scale questionnaires, yielding nine key evaluation criteria that formed the initial rubric. The rubric was applied to 33 AI-assisted essays written by students as part of an intensive course assignment. Statistical analysis showed significant inter-rater reliability and convergent validity coefficients, supporting the adoption and further development of such rubrics across higher education institutions. The rubric was subsequently used by two AI tools, ChatGPT and Claude, to score the same essays; both tools assigned similar scores, demonstrating consistency in their assessment capabilities.
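The abstract does not specify which statistics were used for inter-rater reliability or convergent validity. As a minimal sketch, assuming two raters (or the two AI tools) assigned ordinal rubric scores to the same essays, agreement and convergence could be checked as follows; the variable names and score values are illustrative, not the study's data:

```python
# Hypothetical illustration: quadratically weighted Cohen's kappa for
# inter-rater agreement on ordinal scores, and Pearson's r as one common
# index of convergence between two sets of ratings.
from sklearn.metrics import cohen_kappa_score
from scipy.stats import pearsonr

# Illustrative 1-5 rubric scores for ten essays (not from the study).
rater_a = [4, 3, 5, 2, 4, 4, 3, 5, 4, 3]
rater_b = [4, 3, 4, 2, 5, 4, 3, 5, 4, 2]

# Chance-corrected agreement; quadratic weights penalize large
# disagreements more than adjacent-category ones.
kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

# Linear association between the two score sets.
r, p = pearsonr(rater_a, rater_b)

print(f"weighted kappa = {kappa:.2f}, Pearson r = {r:.2f} (p = {p:.3f})")
```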
Details
Academic Achievement;
Language Teachers;
Second Languages;
Holistic Evaluation;
Student Evaluation;
Artificial Intelligence;
Delphi Technique;
Educational Assessment;
Language Proficiency;
Learner Engagement;
English;
Student Characteristics;
Scoring Rubrics;
Language Research;
Teaching Methods;
Computers;
Interrater Reliability;
English (Second Language);
Periodicals;
Computer Assisted Instruction;
Essays;
Outcomes of Education;
Higher Education;
Reliability;
Literature Reviews;
Quantitative Analysis;
Foreign Language Learning;
Teachers;
Second Language Writing;
Chatbots;
Statistical Analysis;
Foreign Languages;
Higher Education Institutions;
Language;
Student Writing;
Automation;
Ethics;
Feedback;
Convergent Validity
1 Arab Open University, Saudi Arabia
2 King Faisal University, Saudi Arabia
