Cloud computing has revolutionized service deployment by offering scalable, flexible remote resources. Yet a central challenge for cloud providers remains: utilizing servers efficiently while maintaining performance. Providers typically address this by (1) developing VM scheduling techniques that leverage CPU oversubscription with performance guarantees, and (2) introducing novel VM types such as Burstable Virtual Machines (BVMs), which offer dynamic CPU allocation in contrast to the fixed allocation of Regular VMs. This thesis proposes methods to improve server utilization using insights drawn from real-world cloud workloads.
This work is the first to conduct a large-scale study of BVMs using real-world cloud traces. By comparing the CPU usage behavior of Burstable VMs with that of Regular VMs, we uncover distinct characteristics that reveal the shortcomings of current scheduling approaches and existing VM offerings. These findings underscore the importance of developing new scheduling strategies that effectively accommodate the bursty workloads of Burstable VMs, and they highlight the need for new VM types that better align with workload demands.
Based on our data analysis, we propose novel VM scheduling algorithms that leverage CPU oversubscription to improve utilization while adhering to performance objectives. Specifically, we introduce Audible, a new Burstable VM scheduling approach based on a non-parametric statistical model. Audible is lightweight, workload-independent, and does not require machine learning model training for parameter tuning. We rigorously validate this scheduling technique through large-scale simulations using production-level traces. Our evaluation demonstrates that Audible achieves up to 5 times higher utilization than the method currently used in a cloud, while still enforcing strict performance guarantees.
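To illustrate what a non-parametric admission check for CPU oversubscription can look like, the following minimal sketch uses an empirical quantile of historical aggregate CPU usage. It is not Audible's algorithm, which is not specified here; the function names `empirical_quantile` and `can_admit`, the trace format (one CPU-usage sample in cores per time step), and the default quantile level `q` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def empirical_quantile(samples: Sequence[float], q: float) -> float:
    """Smallest sample v such that at least a fraction q of samples are <= v.

    Uses only the observed data (no fitted distribution), which is what
    makes the check non-parametric.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered))  # 1-based rank of the quantile
    return ordered[max(rank, 1) - 1]


def can_admit(
    hosted: List[Sequence[float]],
    candidate: Sequence[float],
    capacity: float,
    q: float = 0.99,  # hypothetical performance target, not from the thesis
) -> bool:
    """Admit the candidate VM if the q-quantile of aggregate usage fits.

    hosted:    CPU-usage traces (cores per time step) of VMs already on the server
    candidate: CPU-usage trace of the VM to be placed
    capacity:  physical cores available on the server
    """
    traces = list(hosted) + [candidate]
    steps = min(len(t) for t in traces)  # compare over the common time window
    aggregate = [sum(t[i] for t in traces) for i in range(steps)]
    return empirical_quantile(aggregate, q) <= capacity


if __name__ == "__main__":
    hosted = [[1, 1, 1, 4], [1, 1, 1, 1]]  # one bursty VM, one steady VM
    candidate = [1, 1, 1, 1]
    # Peak demands sum to 6 cores, so placing all three VMs on a 4-core
    # server oversubscribes it; the check passes only at a relaxed target.
    print(can_admit(hosted, candidate, capacity=4, q=0.75))
    print(can_admit(hosted, candidate, capacity=4, q=1.0))
```

In this sketch the quantile level plays the role of the performance objective: a higher `q` tolerates fewer time steps in which aggregate demand exceeds capacity, at the cost of lower utilization.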
Another major outcome of our analysis is the identification of demand for a new VM type that combines the widespread adoption of Regular VMs with the dynamic CPU allocation policies of BVMs. We therefore introduce the Universal Burstable VM, along with a supporting scheduling framework that uses probabilistic modeling to ensure performance guarantees. Its lower cost gives users an incentive to adopt the Universal Burstable VM, while CPU oversubscription enables providers to utilize their servers more efficiently. Our results show that this new VM type can improve cluster utilization for workloads originally running on Regular VMs by a factor of 3.5.