Datacenters are increasingly adopting heterogeneous memory and storage hierarchies to meet the demands of modern data-intensive workloads. These hierarchies include not just DRAM and disks but also new technologies such as Non-Volatile Memory (NVM), CXL-attached memory, and ultra-low-latency SSDs. Traditional interfaces and system software fall short of fully utilizing these new devices. This dissertation tackles the challenges of integrating new memory and storage technologies into existing hierarchies by rethinking the system software that manages them, the hardware interface, or both. The resulting interfaces, system policies, and runtime systems help applications fully exploit the potential of these devices.
First, I introduce ASAP, a speculative persistence model for byte-addressable NVM that preserves crash consistency without stalling normal execution. ASAP uses a hardware persist buffer to track updates destined for NVM: it eagerly flushes data and speculatively updates memory for high performance. To handle the occasional crash, it stores minimal recovery metadata in the memory controller caches. Simulation experiments show that ASAP achieves a 2.3x average speedup over existing systems and performs within 4% of an ideal instantly persistent system.
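The speculative-persistence idea can be illustrated with a small software model. This is an illustrative sketch only, not the actual ASAP hardware: all names (`PersistBuffer`, `drain`, `recover`) and the dictionary-based "memory"/"NVM" views are assumptions made for clarity.

```python
# Illustrative software model of a speculative persist buffer:
# stores update the volatile view immediately, flushes to NVM are eager
# but asynchronous, and minimal undo metadata allows rollback after a crash.
class PersistBuffer:
    def __init__(self):
        self.memory = {}   # volatile view, updated speculatively
        self.nvm = {}      # durable view, updated when flushes complete
        self.undo = {}     # minimal recovery metadata: addr -> old durable value
        self.pending = []  # stores whose flush to NVM is still in flight

    def store(self, addr, value):
        # Speculative update: execution proceeds without waiting for durability.
        self.undo.setdefault(addr, self.nvm.get(addr))
        self.memory[addr] = value
        self.pending.append(addr)  # eagerly scheduled for flushing

    def drain(self):
        # Flushes complete: data is durable, recovery metadata can be dropped.
        for addr in self.pending:
            self.nvm[addr] = self.memory[addr]
            self.undo.pop(addr, None)
        self.pending.clear()

    def recover(self):
        # After a crash, roll NVM back to the last consistent durable state.
        for addr, old in self.undo.items():
            if old is None:
                self.nvm.pop(addr, None)
            else:
                self.nvm[addr] = old
        self.memory = dict(self.nvm)
        self.undo.clear()
        self.pending.clear()
```

In this model, a crash between `store` and `drain` leaves the undo metadata intact, so `recover` restores the last durable state, mirroring how ASAP's recovery metadata in the memory controller caches bounds what must be repaired.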
Second, I present BypassD, a novel I/O architecture that eliminates software overheads from the critical path of I/O operations to modern ultra-low-latency SSDs while still allowing multiple applications to share the device safely. BypassD virtualizes SSD block addresses and relies on the IOMMU to perform translation and permission checks. BypassD reduces I/O latencies by 40% and improves the performance of real-world workloads such as WiredTiger by 20%.
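The translation-and-protection step can be sketched in miniature. This is a hypothetical model, not BypassD's implementation: the `IommuTable` class, its methods, and the single-bit permission model are assumptions made to show how an IOMMU-style table can check each access on the data path without kernel involvement.

```python
# Hypothetical sketch: applications issue I/O against virtual block
# addresses; an IOMMU-like table translates them to physical blocks and
# enforces permissions, so the kernel stays off the common-case data path.
class IommuTable:
    def __init__(self):
        self.mappings = {}  # virtual block -> (physical block, writable?)

    def map(self, vblock, pblock, writable):
        # Set up by the kernel once, ahead of time (the control path).
        self.mappings[vblock] = (pblock, writable)

    def translate(self, vblock, is_write):
        # Checked in hardware on every access (the data path).
        entry = self.mappings.get(vblock)
        if entry is None:
            raise PermissionError("unmapped block")  # faults to the kernel
        pblock, writable = entry
        if is_write and not writable:
            raise PermissionError("write to read-only mapping")
        return pblock
```

An unmapped or illegal access faults instead of reaching the device, which is what makes it safe to let untrusted applications submit I/O directly.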
Third, I develop ARMS (Adaptive and Robust Memory tiering System), which overcomes the limitations of existing tiering systems by eliminating static thresholds. ARMS combines dual-horizon moving-average hotness scoring, cost-benefit-aware migration, and bandwidth-aware batched transfers to adapt online to workload and platform changes. It matches hand-tuned heuristics (within 3%) while outperforming state-of-the-art systems by 1.26x-2.3x, all without per-workload tuning.
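The dual-horizon idea can be sketched as a pair of exponential moving averages. This is a hedged illustration, not ARMS's actual scoring function: the `make_scorer` helper, the specific smoothing factors, and the equal-weight blend of the two horizons are all assumptions made for the example.

```python
# Illustrative dual-horizon hotness scoring: a fast EMA reacts to recent
# access bursts, a slow EMA captures long-term behavior, and the page's
# hotness blends the two so neither transients nor stale history dominate.
def make_scorer(fast_alpha=0.5, slow_alpha=0.05):
    fast, slow = 0.0, 0.0

    def update(accesses_this_interval):
        nonlocal fast, slow
        fast = fast_alpha * accesses_this_interval + (1 - fast_alpha) * fast
        slow = slow_alpha * accesses_this_interval + (1 - slow_alpha) * slow
        return (fast + slow) / 2  # combined hotness score

    return update
```

A score like this feeds the cost-benefit migration decision: a page is only promoted when the expected latency savings over its remaining hot period outweigh the one-time cost of moving it.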