Abstract
Multicore scaling is reaching its limit with the end of Dennard scaling, yet data and processing needs continue to grow rapidly. The new generation of cloud applications has made large-scale application development commonplace, so this software growth shows no sign of slowing. Unlike in the past, we cannot sustain this growth simply by adding more hardware to systems. It is developers' responsibility to write optimized software that uses the underlying hardware efficiently to sustain innovation.
How applications place data relative to where they perform computation can greatly impact performance across diverse hardware, ranging from single-CPU machines to multi-socket NUMA machines to distributed clusters. This dissertation demonstrates, across three application domains (DNN inference, OS kernels, and distributed key-value stores), that while no universal placement strategy exists for all of these domains, it is feasible to develop systematic abstractions that enable the movement, replication, and partitioning of workloads across cores and machines. Such abstractions alleviate the need for the ad-hoc, manual solutions that are currently prevalent.
This dissertation exemplifies this approach with three systems. The first, Packrat, uses controlled data and compute placement to improve DNN inference latency on single-CPU machines. It algorithmically partitions large DNN tasks into smaller ones and places them on CPU cores to improve overall throughput and latency. The second system, NrOS, uses controlled data and compute placement to improve the performance of OS kernels on multi-socket NUMA machines while simplifying kernel development. It replicates kernel data structures across NUMA nodes to avoid costly remote memory accesses, improving throughput and latency for OS services such as the file system and virtual memory. The third system, ASFP, uses controlled data and compute placement to improve the performance of distributed key-value stores. It logically decouples storage functions and, based on resource demands, places them on storage servers and clients to improve overall system throughput and latency.