Abstract

Machine Learning (ML) models and Large Language Models (LLMs) have demonstrated strong capabilities in automating tasks traditionally performed manually. Their effectiveness has led to their integration into critical applications, where they form the backbone of complex AI-based systems. To manage the full lifecycle of these models, operational frameworks such as Machine Learning Operations (MLOps) and Large Language Model Operations (LLMOps) have emerged, offering tailored tools and best practices. MLOps and LLMOps pipelines enable the continuous deployment of improved models, either by updating to more capable versions or by adapting models to evolving, dynamic environments through Continuous Training (CT) on fresh production data. These practices aim to enhance the dependability of AI-based systems. However, despite offering best practices and tools that support robustness, MLOps and LLMOps pipelines do not inherently guarantee reliability or trustworthiness. For instance, deploying a sub-optimal model may lead to performance degradation instead of improvement, ultimately compromising the reliability of the entire pipeline. Such reliability issues can lead to costly failures, loss of stakeholder trust, and critical errors in high-stakes applications.

In this thesis, we aim to contribute to improving the reliability and trustworthiness of both MLOps and LLMOps pipelines. In the first part, we focus on MLOps and aim to enhance the reliability of models updated through CT workflows. Although CT is designed to improve AI-based system performance, it can also introduce risks when production data is noisy and poorly managed, leading to catastrophic regressions or silent performance degradation. In practice, production data may suffer from distribution drift or be automatically labeled with low confidence, making it unsuitable for reliable CT. To address these challenges and promote more robust CT pipelines, we propose a reliable maintenance approach based on a filtering mechanism for incoming data. This method excludes low-confidence instances, which are likely to be mislabeled, as well as samples that significantly deviate from the original distribution. This approach helps safeguard the CT process and ensures more reliable model updates over time.
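The filtering idea above can be illustrated with a minimal sketch: drop incoming samples whose auto-label confidence falls below a threshold, and drop samples whose features lie far from the original training distribution. This is only an illustrative simplification (one-dimensional features, a z-score drift test, hypothetical thresholds), not the thesis's actual implementation.

```python
import statistics

def filter_for_ct(samples, reference, conf_threshold=0.8, z_threshold=3.0):
    """Keep only production samples suitable for Continuous Training.

    `samples` is a list of (feature_value, label_confidence) pairs;
    `reference` is a list of feature values from the original training
    data. Thresholds are illustrative, not values from the thesis.
    """
    mu = statistics.mean(reference)
    sigma = statistics.stdev(reference)
    kept = []
    for value, confidence in samples:
        if confidence < conf_threshold:
            continue  # likely mislabeled: auto-label confidence too low
        if sigma > 0 and abs(value - mu) / sigma > z_threshold:
            continue  # drifted: too far from the training distribution
        kept.append((value, confidence))
    return kept
```

For example, given training data centered around 1.0, a confidently labeled in-distribution sample passes, while a low-confidence sample and a far-out-of-distribution sample are both excluded before retraining.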

The second part of this thesis focuses on LLMOps, particularly pipelines for code generation tasks. With the rapid evolution of large language models, new versions are frequently released, often accompanied by promises of significant improvements. However, such updates can inadvertently introduce regressions, and even the most advanced models may produce code with inefficiencies. This makes it difficult to maintain consistent quality and long-term reliability across model versions.

To address these challenges, we first propose a taxonomy of inefficiencies commonly observed in code generated by LLMs. This taxonomy provides a structured basis for systematically evaluating model outputs and identifying recurring flaws. Building on this foundation, we introduce ReCatcher, a regression testing suite designed to detect both capability regressions and improvements between different LLM versions. ReCatcher thus contributes to a more transparent, trustworthy, and well-informed continuous deployment process for language models.
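The core comparison behind regression testing across model versions can be sketched as follows. This is a hedged illustration of the general idea (diffing per-task pass/fail outcomes between two model versions), not ReCatcher's actual interface; the function and data shapes are hypothetical.

```python
def diff_model_versions(old_results, new_results):
    """Compare per-task outcomes of two LLM versions.

    `old_results` and `new_results` map task IDs to True (the generated
    code passed the task's checks) or False (it failed). Returns tasks
    that regressed (passed before, fail now) and tasks that improved.
    """
    shared = old_results.keys() & new_results.keys()
    regressions = sorted(t for t in shared if old_results[t] and not new_results[t])
    improvements = sorted(t for t in shared if not old_results[t] and new_results[t])
    return {"regressions": regressions, "improvements": improvements}
```

Reporting both directions, rather than only failures, is what makes such a comparison useful for deciding whether a new model version is a net improvement before deployment.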

Together, these contributions aim to strengthen the reliability of AI-based systems by addressing key challenges in MLOps and LLMOps pipelines, while providing concrete solutions and actionable insights for safer model deployment.

Details

Business indexing term
1010268
Title
Towards Reliable and Trustworthy Pipelines for MLOps and LLMOps
Number of pages
133
Publication year
2025
Degree date
2025
School code
1105
Source
MAI 87/3(E), Masters Abstracts International
ISBN
9798293882120
Committee member
Lamothe, Maxime
University/institution
Ecole Polytechnique, Montreal (Canada)
University location
Canada -- Quebec, CA
Degree
M.A.Sc.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32317565
ProQuest document ID
3254451511
Document URL
https://www.proquest.com/dissertations-theses/towards-reliable-trustworthy-pipelines-mlops/docview/3254451511/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic