
Abstract

Human vs. Automated Coding Style Grading in Computing Education

Computer programming courses often evaluate student coding style manually. Static analysis tools provide an opportunity to automate this process. In this paper, we explore the tradeoffs between human style graders and general-purpose static analysis tools for evaluating student code. We investigate the following research questions:

- Are human coding style evaluation scores consistent with static analysis tools?
- Which style grading criteria are best evaluated with existing static analysis tools, and which are more effectively evaluated by human graders?

We analyze data from a second-semester programming course with 943 enrolled students at a large research institution. Hired student graders evaluated student code against rubric criteria such as “Lines are not too long” or “Code is not too deeply nested.” We also ran several static analysis tools on the same student code to evaluate the same criteria. We then analyzed, for each criterion, the correlation between the number of static analysis warnings and the human style grading score.
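As an illustration of this analysis (the abstract does not specify the authors' implementation), the following Python sketch computes a per-criterion correlation, assuming warning counts and rubric scores have already been collected per student; the data layout and field names are hypothetical.

# Hypothetical sketch of the per-criterion correlation analysis described
# above; field names and data layout are assumptions, not the authors' code.
from scipy.stats import pearsonr

def criterion_correlation(students, criterion):
    """Correlate static analysis warning counts with human rubric scores."""
    warnings = [s["warnings"][criterion] for s in students]  # tool warning counts
    scores = [s["rubric"][criterion] for s in students]      # human-assigned points
    r, p = pearsonr(warnings, scores)
    return r, p

# Example usage with the assumed layout:
# students = [{"warnings": {"line_length": 3}, "rubric": {"line_length": 2}}, ...]
# r, p = criterion_correlation(students, "line_length")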

In our preliminary results, we see that static analysis tools tend to be more effective at evaluating objective code style criteria. We found a weak negative correlation or no correlation between the human style grading score and the number of static analysis warnings; note that we would expect student code with more static analysis warnings to receive fewer human style grading points. When comparing the “Lines are not too long” human style grading criterion to a related line-length static analysis inspection, we see a Pearson correlation coefficient of r = -0.21. We also see trends in the distributions of human style grading scores that suggest human graders perform inconsistently. For example, 50% of students who received full human style grading points for the line-length criterion had 3 or more static analysis warnings from a related line-length inspection. Conversely, 23% of students who received no points on the same criterion had no static analysis warnings for the line-length inspection.
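A minimal sketch of how these distribution checks could be tabulated, under the same hypothetical data layout as above; the thresholds mirror the figures reported in this paragraph and are not taken from the authors' code.

# Hedged sketch of the score-distribution checks described above.
def inconsistency_stats(students, criterion, full_score):
    full = [s for s in students if s["rubric"][criterion] == full_score]
    zero = [s for s in students if s["rubric"][criterion] == 0]
    # Share of full-credit students with 3+ warnings (reported as 50% above)
    frac_full_noisy = sum(s["warnings"][criterion] >= 3 for s in full) / len(full)
    # Share of zero-credit students with no warnings (reported as 23% above)
    frac_zero_clean = sum(s["warnings"][criterion] == 0 for s in zero) / len(zero)
    return frac_full_noisy, frac_zero_clean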

We also found that some code style criteria are not well suited to the general-purpose static analysis tools we investigated. For example, none of the static analysis tools we investigated provide a robust way of evaluating the quality of variable and function names in a program. Some tools provide an inspection for detecting variable names that are shorter than a user-specified length threshold; however, this inspection fails to identify low-quality variable names that happen to be longer than the minimum allowed length. Furthermore, there are some common scenarios where a short variable name is acceptable by convention.
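To make this limitation concrete, here is an illustrative AST-based sketch (not from the paper) of a minimum-length name inspection in Python; the length threshold and allowlist of conventional short names are assumptions, and the comments note the heuristic's blind spots.

# Illustrative length-threshold name inspection and its blind spots.
import ast

ALLOWED_SHORT = {"i", "j", "k", "x", "y", "_"}  # conventional short names

def short_name_warnings(source, min_len=3):
    """Flag assigned variable names shorter than min_len, a common tool heuristic."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if len(node.id) < min_len and node.id not in ALLOWED_SHORT:
                warnings.append(node.id)
    return warnings

# The heuristic misses low-quality long names ("thing_value_stuff" passes),
# while a conventional loop index "i" would be flagged without the allowlist.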

Static analysis tools have the benefit of integration with an automated grading system, facilitating faster and more frequent feedback than human grading. The literature suggests that frequent feedback encourages students to actively improve their work (Spacco et al. 2006), and there is evidence that increased engagement is most beneficial to students with less experience (Carini et al. 2006). Our results suggest that automated code quality evaluation could be one tool that benefits student learning in introductory CS courses, particularly for the students with the least access to pre-college CS training.

References

- Carini, R. M., Kuh, G. D., & Klein, S. P. (2006). Student engagement and student learning: Testing the linkages. Research in Higher Education, 47(1), 1–32.
- Spacco, J., & Pugh, W. (2006). Helping students appreciate test-driven development (TDD). In Proceedings of OOPSLA 2006, pp. 907–913.

Details

Title
Human vs. Automated Coding Style Grading in Computing Education
Source details
Conference: 2019 ASEE Annual Conference & Exposition; Location: Tampa, Florida; Start Date: June 15, 2019; End Date: June 19, 2019
Publication year
2019
Publication date
Jun 15, 2019
Publisher
American Society for Engineering Education-ASEE
Place of publication
Atlanta
Country of publication
United States
Source type
Conference Paper
Language of publication
English
Document type
Conference Proceedings
Publication history
Online publication date
2019-07-09
First posting date
09 Jul 2019
ProQuest document ID
2314005511
Document URL
https://www.proquest.com/conference-papers-proceedings/human-vs-automated-coding-style-grading-computing/docview/2314005511/se-2?accountid=208611
Copyright
© 2019. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the associated terms available at https://peer.asee.org/about .
Last updated
2025-11-14
Database
ProQuest One Academic