
Abstract

Continuous Integration (CI) is the practice of continuously integrating new code changes into a project’s main codebase. To ensure the quality of the integrated changes, software development companies typically rely on two types of quality gates: automated gates, which leverage the build process to invoke compilation commands and run tests, and manual gates, which are based on peer review.

As projects scale up in size, the computational and manual effort required to maintain CI quality gates can become considerable. To assist or improve CI quality gates, the Software Engineering (SE) community has proposed a wide range of software analytics models. However, most prior work focuses on the initial deployment of such models, without considering the factors involved in their long-term sustainability: automated gates require smart retraining processes to maintain performance, while manual gates require synergy with reviewers’ workflows.

In this thesis, we study sustainable AI for CI quality gates. For automated gates, we explore how to improve the learning process of software analytics models using (1) dynamic training scheduling strategies, where we evaluate different ways to dynamically adapt the training time for Retraining-From-Scratch (RFS) setups, and (2) Lifelong Learning (LL) training setups, where we change the fundamental learning algorithms toward an incremental setup. For the manual CI quality gates, we perform user studies to evaluate the impact on reviewers’ productivity of (3) hotspot-based file-ordering, where problematic changes are shown first to the reviewers, and of (4) reviewers’ interaction with generated review comments.

We were able to optimize the trade-off between model performance and the computational effort of model retraining, with our best heuristics recommending retraining only once every 5-6 weeks without losing performance relative to weekly retrained models. LL substantially reduces computational effort even further, by 2-40x compared to RFS. Through large-scale industrial user studies, we also observe that reviewers benefit from ML-based assistance: file-reordering led to more (+23%) and better (precision +13%, recall +8%) review comments compared to the default alphanumeric file-ordering, while generative AI-based review comment generation obtained promising results regarding acceptance (8.1% and 7.2%) and appreciation (23% and 28.3%).
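
As a rough illustration of the difference between hotspot-based and default alphanumeric file-ordering, the following minimal sketch sorts the files of a code change both ways. It is not the thesis's implementation: the `ChangedFile` type, the `hotspot_score` field (standing in for the output of some prediction model), and the example paths and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangedFile:
    path: str
    # Hypothetical model output: a higher score means the file is assumed
    # more likely to need a reviewer's attention.
    hotspot_score: float


def alphanumeric_order(files):
    """Default ordering: sort the changed files by path."""
    return sorted(files, key=lambda f: f.path)


def hotspot_order(files):
    """Hotspot-based ordering: highest score first, ties broken by path."""
    return sorted(files, key=lambda f: (-f.hotspot_score, f.path))


# Made-up change with made-up scores, for illustration only.
change = [
    ChangedFile("README.md", 0.05),
    ChangedFile("src/scheduler.py", 0.82),
    ChangedFile("src/api.py", 0.40),
    ChangedFile("tests/test_api.py", 0.40),
]

for f in hotspot_order(change):
    print(f"{f.hotspot_score:.2f}  {f.path}")
```

With the alphanumeric ordering, the low-scoring `README.md` would be shown first; with the hotspot ordering, `src/scheduler.py` moves to the top of the review.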

Details

1010268
Business indexing term
Title
Towards Sustainable AI for Continuous Integration Quality Gates
Number of pages
216
Publication year
2025
Degree date
2025
School code
0283
Source
DAI-B 86/10(E), Dissertation Abstracts International
ISBN
9798311911795
Advisor
University/institution
Queen's University (Canada)
University location
Canada -- Ontario, CA
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
31923447
ProQuest document ID
3214296986
Document URL
https://www.proquest.com/dissertations-theses/towards-sustainable-ai-continuous-integration/docview/3214296986/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic