Abstract

Self-Admitted Technical Debt (SATD) refers to sub-optimal software development choices that developers document in code comments or other project resources. It poses challenges to the maintainability and evolution of software systems. Recognizing the importance of managing SATD to mitigate its detrimental impact on software quality, this thesis focuses on four essential aspects of SATD management: (1) SATD identification and classification, which detects SATD instances in code comments and classifies them; (2) SATD tracking, which identifies SATD occurrences in repositories and tracks them through commits until their removal or latest recorded location; (3) automated SATD repayment, which addresses SATD automatically through code generation; and (4) an empirical study of SATD across programming languages to analyze its prevalence, deletion ratio, lifespan, and other characteristics.

For SATD identification and classification, we investigate the effectiveness of large language models (LLMs) across different settings, including in-context learning (ICL), fine-tuning, LLM size, and the impact of providing contextual information, such as surrounding code.
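To make the in-context learning setting concrete, the following is a minimal sketch of how a few-shot prompt for SATD classification might be assembled, with surrounding code as optional context. The label names, the prompt wording, and the function build_icl_prompt are illustrative assumptions; the abstract does not specify the prompts or the classification scheme used in the thesis.

```python
from typing import List, Optional, Tuple

# Hypothetical label set; the thesis's actual taxonomy is not given in the abstract.
LABELS = ["SATD", "not-SATD"]


def build_icl_prompt(
    examples: List[Tuple[str, str]],
    comment: str,
    context: Optional[str] = None,
) -> str:
    """Build a few-shot prompt from (comment, label) demonstrations.

    `context` is optional surrounding code for the comment being classified.
    """
    lines = ["Classify each code comment as one of: " + ", ".join(LABELS) + "."]
    # Each demonstration is shown as a comment followed by its label.
    for text, label in examples:
        lines.append(f"Comment: {text}")
        lines.append(f"Label: {label}")
    # Contextual information is added only when provided.
    if context:
        lines.append("Surrounding code:")
        lines.append(context)
    # The query comment comes last, with the label left for the model to fill in.
    lines.append(f"Comment: {comment}")
    lines.append("Label:")
    return "\n".join(lines)
```

Passing an empty list of examples yields a zero-shot prompt, so the same helper covers both settings in this sketch.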

For SATD tracking, we develop the first programming-language-independent tracking tool, which enables precise monitoring of SATD lifecycles and supports accurate analysis of SATD repayment practices.

We study the effectiveness of LLMs in automated SATD repayment by creating the largest SATD repayment datasets ever assembled (58,722 items for Python and 97,347 items for Java) and introducing three new, effective evaluation metrics: BLEU-diff, CrystalBLEU-diff, and LEMOD. Using our new benchmarks and evaluation metrics, we evaluate two types of automated SATD repayment methods: fine-tuning smaller models, and prompt engineering with five large-scale models.

Finally, we conduct an empirical study of SATD across three popular programming languages. Leveraging our SATD Tracker, we process 13,693 Python repositories, 7,659 Java repositories, and 11,203 JavaScript repositories, extracting the largest SATD datasets ever created for these languages. We analyze these datasets to compare the characteristics of SATD across programming languages and repository sizes.
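As an illustration of the kind of characteristics such a study compares, the sketch below computes a deletion ratio and a median lifespan from tracked SATD records. The SatdRecord type and both definitions (removed items divided by all items; days between introduction and removal) are assumptions made for this example, not the definitions or the output format of the thesis's SATD Tracker.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List, Optional


@dataclass
class SatdRecord:
    """Hypothetical tracked SATD item: when it appeared and, if ever, when it was removed."""
    introduced: datetime
    removed: Optional[datetime]  # None means the item is still present


def deletion_ratio(records: List[SatdRecord]) -> float:
    """Fraction of SATD items that were eventually removed (0.0 for no records)."""
    if not records:
        return 0.0
    removed = sum(1 for r in records if r.removed is not None)
    return removed / len(records)


def median_lifespan_days(records: List[SatdRecord]) -> Optional[float]:
    """Median days between introduction and removal, over removed items only."""
    spans = [
        (r.removed - r.introduced).total_seconds() / 86400
        for r in records
        if r.removed is not None
    ]
    return median(spans) if spans else None
```

Items that are never removed are excluded from the lifespan figure here; a study could instead treat them as censored observations, which is a separate design choice.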

This thesis advances the field of SATD management by introducing innovative methods for SATD identification, tracking, automated repayment, and empirical analysis, providing valuable insights and tools to improve software maintainability and quality across diverse programming languages.

Details

1010268
Business indexing term
Title
Automated Self-Admitted Technical Debt Tracking, Classification, and Repayment for Sustainable Software Development
Number of pages
191
Publication year
2025
Degree date
2025
School code
0283
Source
DAI-A 87/6(E), Dissertation Abstracts International
ISBN
9798270206444
Advisor
Committee member
Adams, Bram; Muise, Christian
University/institution
Queen's University (Canada)
University location
Canada -- Ontario, CA
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32353647
ProQuest document ID
3283374061
Document URL
https://www.proquest.com/dissertations-theses/automated-self-admitted-technical-debt-tracking/docview/3283374061/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic