Abstract

As cloud computing revolutionizes the IT industry, the complexity of underlying cloud systems has been increasing rapidly. The advent of new hardware accelerators and software platforms has made it challenging for cloud users to master the growing development toolkits. Compounding the issue, the programming frameworks and internals of these new systems are highly heterogeneous, with different performance characteristics, resource constraints, management principles, and reliability considerations. Consequently, it is becoming crucial to minimize human effort when managing these new ecosystems. In this dissertation, we advocate for assisting cloud developers and operators by automatically generating system insights. These insights bridge the gap between user intentions and system requirements, providing clarity on the outcomes of user actions on a system without the need for tedious trial-and-error processes.

This dissertation demonstrates how we generate various types of insights for different cloud systems. Firstly, the dissertation explores performance optimization insights, which are critically needed as users attempt to offload legacy code from on-premise servers to emerging accelerators like SmartNICs. These new hardware components feature entirely different programming abstractions, compilers, instruction sets, and architectures. Although a straightforward offloading strategy might functionally work, it could lead to significant performance degradation, undermining the benefits of using accelerators. To address this issue, we create a toolset called Clara, which can automatically predict offloading performance and suggest tuning strategies before extensive deployment efforts. This allows users to make informed decisions on whether and how to offload their legacy code. 

Secondly, the dissertation investigates safety compliance insights for the cloud networking stack, focusing on ensuring the correctness of system updates for the latest generation of runtime-programmable platforms. We observe that even if both the current and intended functionalities are correct and efficient, the intermediate transition state can still introduce consistency and capacity issues into the core network. To tackle this challenge, we employ formal reasoning techniques to achieve update clarity. We develop FlexPlan, an interactive platform that synthesizes runtime transition plans meeting dynamic user demands, greatly reducing the need for manual intervention.

Lastly, the dissertation unearths infrastructure management insights for emerging cloud orchestration platforms. Clouds are constructed by providers like Microsoft but are intended for third-party use. This user/owner division limits cloud users’ visibility and control over cloud service behavior. The adoption of Infrastructure-as-Code (IaC) style cloud orchestration platforms further widens this semantic gap by adding another intermediate layer of abstraction. To address this complexity, we propose Zodiac, a pipeline that automatically uncovers cloud provider requirements and clarifies their interaction with orchestration platforms. The outcome is a set of orchestration rules that cloud users must follow to ensure proper cloud management practices.

Throughout these projects, we leverage and extend techniques from a wide variety of disciplines, such as formal reasoning, software testing, machine learning, and their intersections. The results demonstrate the feasibility of generating useful insights across cloud data, control, and management planes, while unveiling an even larger insight generation and integration design space yet to be explored. 

Details

Title
Assisting Cloud System Development With Automated Insight Generation
Number of pages
141
Publication year
2025
Degree date
2025
School code
0127
Source
DAI-B 86/11(E), Dissertation Abstracts International
ISBN
9798314875407
Advisor
Committee member
Chen, Jiasi; Beckett, Ryan; Wang, Xinyu
University/institution
University of Michigan
Department
Computer Science & Engineering
University location
United States -- Michigan
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32092654
ProQuest document ID
3203096046
Document URL
https://www.proquest.com/dissertations-theses/assisting-cloud-system-development-with-automated/docview/3203096046/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic