Abstract

Multicore scaling is reaching its limit with the end of Dennard scaling, yet data and processing needs are increasing rapidly. The new generation of cloud applications is making large-scale application development commonplace, so this software growth shows no sign of slowing. Unlike in the past, we cannot sustain this growth simply by adding more hardware to systems; it is developers’ responsibility to write optimized software that uses the underlying hardware efficiently to sustain innovation.

How applications place data relative to where they perform computation can greatly impact performance across systems with diverse resource profiles, ranging from single-CPU machines to multi-socket NUMA machines to distributed clusters. This dissertation demonstrates, across three application domains (DNN inference, OS kernels, and distributed key-value stores), that while no single placement strategy works for all of them, it is feasible to develop systematic abstractions for moving, replicating, and partitioning workloads across cores and machines. Such abstractions alleviate the need for the ad-hoc, manual solutions that are currently prevalent.

This dissertation exemplifies this approach with three systems. The first, Packrat, uses controlled data and compute placement to improve DNN inference latency on single-CPU machines: it algorithmically partitions large DNN tasks into smaller ones and places them on CPU cores to improve overall throughput and latency. The second, NrOS, uses controlled data and compute placement to improve the performance of OS kernels on multi-socket NUMA machines while simplifying kernel development: it replicates kernel data structures across NUMA nodes to avoid costly remote memory accesses, improving throughput and latency for OS services such as the file system and virtual memory. The third, ASFP, uses controlled data and compute placement to improve the performance of distributed key-value stores: it logically decouples storage functions and, based on resource demands, places them on storage servers or clients to improve overall system throughput and latency.
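
To make the placement idea concrete, the following is a minimal illustrative sketch, not code from any of the three systems: it splits an inference batch into sub-batches and pins each worker process to a disjoint set of CPU cores, in the spirit of Packrat-style partitioning. The names `run_model`, `NUM_INSTANCES`, and `CORES_PER_INSTANCE` are hypothetical, the model is a placeholder computation, and the core-pinning call (`os.sched_setaffinity`) is Linux-only.

```python
# Illustrative sketch only: partition a batch across CPU cores by pinning
# each worker process to its own disjoint core set. Not Packrat's code.
import os
from multiprocessing import Pool

NUM_INSTANCES = 4        # hypothetical: number of model instances
CORES_PER_INSTANCE = 2   # hypothetical: cores dedicated to each instance

def run_model(args):
    instance_id, sub_batch = args
    # Pin this worker to its dedicated cores (Linux-only system call).
    first = instance_id * CORES_PER_INSTANCE
    os.sched_setaffinity(0, set(range(first, first + CORES_PER_INSTANCE)))
    # Placeholder for actual DNN inference over the sub-batch.
    return [x * 2 for x in sub_batch]

if __name__ == "__main__":
    batch = list(range(16))
    # Split the batch evenly across the instances (data placement),
    # then run each sub-batch on its pinned worker (compute placement).
    sub_batches = [batch[i::NUM_INSTANCES] for i in range(NUM_INSTANCES)]
    with Pool(NUM_INSTANCES) as pool:
        results = pool.map(run_model, list(enumerate(sub_batches)))
    print(results)
```

The sketch only shows the shape of the technique: co-locating each slice of the data with a fixed subset of cores so that smaller, independent tasks run in parallel instead of one large task contending for all cores.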

Details

Title
Controlled Data and Compute Placement for Scaling Application Performance
Author
Ankit
Publication year
2023
Publisher
ProQuest Dissertations & Theses
ISBN
9798380593618
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
2877268551
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.