
Abstract

Human vs. Automated Coding Style Grading in Computing Education

Computer programming courses often evaluate student coding style manually. Static analysis tools provide an opportunity to automate this process. In this paper, we explore the tradeoffs between human style graders and general-purpose static analysis tools for evaluating student code. We investigate the following research questions:

- Are human coding style evaluation scores consistent with static analysis tools?
- Which style grading criteria are best evaluated with existing static analysis tools, and which are more effectively evaluated by human graders?

We analyze data from a second-semester programming course with 943 enrolled students at a large research institution. Hired student graders evaluated student code against rubric criteria such as “Lines are not too long” or “Code is not too deeply nested.” We also ran several static analysis tools on the same student code to evaluate the same criteria. We then analyzed, for each criterion, the correlation between the number of static analysis warnings and the human style grading score.
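As an illustration of this analysis (the abstract does not specify the authors' implementation), the following Python sketch computes a per-criterion correlation, assuming warning counts and rubric scores have already been collected per student; the data layout and field names are hypothetical.

# Hypothetical sketch of the per-criterion correlation analysis described
# above; field names and data layout are assumptions, not the authors' code.
from scipy.stats import pearsonr

def criterion_correlation(students, criterion):
    """Correlate static analysis warning counts with human rubric scores."""
    warnings = [s["warnings"][criterion] for s in students]  # tool warning counts
    scores = [s["rubric"][criterion] for s in students]      # human-assigned points
    r, p = pearsonr(warnings, scores)
    return r, p

# Example usage with the assumed layout:
# students = [{"warnings": {"line_length": 3}, "rubric": {"line_length": 2}}, ...]
# r, p = criterion_correlation(students, "line_length")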

In our preliminary results, we see that static analysis tools tend to be more effective at evaluating objective code style criteria. We found a weak negative correlation or no correlation between the human style grading score and the number of static analysis warnings; note that we would expect student code with more static analysis warnings to receive fewer human style grading points. When comparing the “Lines are not too long” human style grading criterion to a related line-length static analysis inspection, we see a Pearson correlation coefficient of r = -0.21. We also see trends in the distributions of human style grading scores that suggest human graders perform inconsistently. For example, 50% of students who received full human style grading points for the line-length criterion had 3 or more static analysis warnings from a related line-length inspection. Conversely, 23% of students who received no points on the same criterion had no static analysis warnings for the line-length inspection.
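A minimal sketch of how these distribution checks could be tabulated, under the same hypothetical data layout as above; the thresholds mirror the figures reported in this paragraph and are not taken from the authors' code.

# Hedged sketch of the score-distribution checks described above.
def inconsistency_stats(students, criterion, full_score):
    full = [s for s in students if s["rubric"][criterion] == full_score]
    zero = [s for s in students if s["rubric"][criterion] == 0]
    # Share of full-credit students with 3+ warnings (reported as 50% above)
    frac_full_noisy = sum(s["warnings"][criterion] >= 3 for s in full) / len(full)
    # Share of zero-credit students with no warnings (reported as 23% above)
    frac_zero_clean = sum(s["warnings"][criterion] == 0 for s in zero) / len(zero)
    return frac_full_noisy, frac_zero_clean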

We also found that some code style criteria are not well suited to the general-purpose static analysis tools we investigated. For example, none of the static analysis tools we investigated provide a robust way of evaluating the quality of variable and function names in a program. Some tools provide an inspection for detecting variable names that are shorter than a user-specified length threshold; however, this inspection fails to identify low-quality variable names that happen to be longer than the minimum allowed length. Furthermore, there are some common scenarios where a short variable name is acceptable by convention.
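To make this limitation concrete, here is an illustrative AST-based sketch (not from the paper) of a minimum-length name inspection in Python; the length threshold and allowlist of conventional short names are assumptions, and the comments note the heuristic's blind spots.

# Illustrative length-threshold name inspection and its blind spots.
import ast

ALLOWED_SHORT = {"i", "j", "k", "x", "y", "_"}  # conventional short names

def short_name_warnings(source, min_len=3):
    """Flag assigned variable names shorter than min_len, a common tool heuristic."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if len(node.id) < min_len and node.id not in ALLOWED_SHORT:
                warnings.append(node.id)
    return warnings

# The heuristic misses low-quality long names ("thing_value_stuff" passes),
# while a conventional loop index "i" would be flagged without the allowlist.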

Static analysis tools have the benefit of integration with an automated grading system, facilitating faster and more frequent feedback than human grading. The literature suggests that frequent feedback encourages students to actively improve their work (Spacco et al. 2006), and there is evidence that increased engagement is most beneficial to students with less experience (Carini et al. 2006). Our results suggest that automated code quality evaluation could be one tool that benefits student learning in introductory CS courses, particularly for the students with the least access to pre-college CS training.

References

- Carini, R. M., Kuh, G. D., & Klein, S. P. (2006). Student engagement and student learning: Testing the linkages. Research in Higher Education, 47(1), 1–32.
- Spacco, J., & Pugh, W. (2006). Helping students appreciate test-driven development (TDD). In Proceedings of OOPSLA 2006, pp. 907–913.

Details

Title
Human vs. Automated Coding Style Grading in Computing Education
Source details
Conference: 2019 ASEE Annual Conference & Exposition; Location: Tampa, Florida; Start Date: June 15, 2019; End Date: June 19, 2019
Publication year
2019
Publication date
Jun 15, 2019
Publisher
American Society for Engineering Education-ASEE
Place of publication
Atlanta
Country of publication
United States
Source type
Conference Paper
Language of publication
English
Document type
Conference Proceedings
Publication history
Online publication date
2019-07-09
First posting date
09 Jul 2019
ProQuest document ID
2314005511
Document URL
https://www.proquest.com/conference-papers-proceedings/human-vs-automated-coding-style-grading-computing/docview/2314005511/se-2?accountid=208611
Copyright
© 2019. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the associated terms available at https://peer.asee.org/about .
Last updated
2025-11-14
Database
ProQuest One Academic