
Abstract

Modern computing is increasingly driven by the explosive growth of data from applications such as artificial intelligence (AI), machine learning (ML), and genomics. These workloads are inherently data-intensive, requiring fast and efficient processing of large datasets. Although scaling input data in AI applications continues to boost model performance, traditional computing architectures have struggled to keep pace, creating a widening gap between data generation and processing capabilities.

This disparity stresses the three fundamental pillars of computing—storage, communication, and computation—impacting performance, energy efficiency, and cost. Conventional von Neumann architectures, designed to maximize computational throughput, now face the "memory wall" and "power wall," where compute units cannot fetch or process data fast enough to meet demand. As data movement becomes the dominant bottleneck, there is a clear need to pivot from compute-centric to memory-centric design approaches.

In-Memory Computing (IMC), or Compute-in-Memory (CIM), addresses these challenges by performing computation directly within memory, minimizing data movement and mitigating the memory wall.
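For concreteness, below is a minimal NumPy sketch of one common digital CIM scheme, bit-serial multiply-accumulate: weight bit-planes stay resident in the memory array, input bits are broadcast one at a time, a bitwise AND selects the contributing cells, a popcount (adder tree) counts them, and shift-adds combine the partial sums by bit significance. The function name and the unsigned 4-bit operand widths are illustrative assumptions, not details taken from the dissertation.

```python
import numpy as np

def cim_dot_product(weights, inputs, w_bits=4, x_bits=4):
    """Bit-serial dot product, as a digital CIM macro would evaluate it.
    Hypothetical sketch: names and the unsigned 4-bit defaults are
    illustrative, not taken from the dissertation."""
    acc = 0
    for i in range(w_bits):              # one stored weight bit-plane
        w_plane = (weights >> i) & 1
        for j in range(x_bits):          # one broadcast input bit
            x_plane = (inputs >> j) & 1
            popcount = np.sum(w_plane & x_plane)   # bitwise AND + adder tree
            acc += int(popcount) << (i + j)        # shift-add by significance
    return acc

rng = np.random.default_rng(0)
w = rng.integers(0, 16, size=64)   # 4-bit unsigned weights
x = rng.integers(0, 16, size=64)   # 4-bit unsigned activations
assert cim_dot_product(w, x) == int(w @ x)
```

Because the data never leaves the array and only single-bit signals move, such a macro sidesteps most of the operand traffic that dominates energy in a conventional load-compute-store pipeline.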

This dissertation introduces a series of digital CIM circuits and architectures that significantly improve power, performance, and area (PPA) metrics for data-intensive workloads. It begins with a programmable CIM design that balances the flexibility of central processing units (CPUs) and graphics processing units (GPUs) with the efficiency of application-specific integrated circuits (ASICs), enabling a broad class of applications. A prototype 28nm CMOS chip is then presented that accelerates general matrix-matrix multiplications (GEMMs) across various fixed-point precisions.
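As a rough illustration of how a full GEMM maps onto such a macro, the hypothetical sketch below (reusing cim_dot_product from the sketch above) holds tiles of one operand stationary in the array while rows of the other operand stream against them; precision becomes a runtime parameter because bit-serial hardware simply runs more or fewer bit cycles. The tile size, loop order, and names are assumptions for illustration, not the dissertation's mapping.

```python
def cim_gemm(A, B, w_bits=8, x_bits=8, tile=64):
    """Map a GEMM onto repeated CIM macro calls: 'tile'-sized chunks of B's
    columns are held stationary in the array while rows of A stream against
    them bit-serially. Assumed mapping, for illustration only."""
    M, K = A.shape
    _, N = B.shape
    C = np.zeros((M, N), dtype=np.int64)
    for k0 in range(0, K, tile):                   # one weight-tile load
        for m in range(M):
            for n in range(N):
                C[m, n] += cim_dot_product(A[m, k0:k0 + tile],
                                           B[k0:k0 + tile, n],
                                           w_bits, x_bits)
    return C

A = rng.integers(0, 256, size=(8, 128))   # 8-bit unsigned operands
B = rng.integers(0, 256, size=(128, 4))
assert np.array_equal(cim_gemm(A, B), A @ B)
```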

The focus then shifts to sparse GEMM acceleration. The first design demonstrates how a CIM architecture tailored for channel decoders leverages both fixed and unstructured sparsity to outperform conventional designs. The second design, fabricated in 28nm CMOS, supports diverse unstructured sparse formats and integer precisions, efficiently targeting highly sparse deep neural networks (DNNs). The final design achieves state-of-the-art efficiency for compressed sparse GEMMs, supporting both integer and floating-point data types on shared hardware, and integrates a RISC-V CPU to manage computation across diverse matrix sizes and model types.
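Compressed sparse row (CSR) is one representative unstructured sparse format; the sketch below shows why compressed sparse GEMM pays off: only stored nonzeros are multiplied, so work scales with the nonzero count rather than with the full dense product. The abstract does not enumerate which formats the chips support, so CSR here is an assumption for illustration.

```python
import numpy as np

def csr_spmm(indptr, indices, data, B):
    """Sparse-times-dense GEMM over a CSR-encoded matrix A. CSR is an
    assumed format for illustration; the dissertation's chips may use
    other compressed formats."""
    M, N = len(indptr) - 1, B.shape[1]
    C = np.zeros((M, N))
    for m in range(M):
        for p in range(indptr[m], indptr[m + 1]):  # nonzeros of row m only
            C[m] += data[p] * B[indices[p]]        # zero operands never touched
    return C

A = np.array([[0, 2, 0, 0],
              [1, 0, 0, 3],
              [0, 0, 0, 0]])
indptr  = np.array([0, 1, 3, 3])   # row start offsets into indices/data
indices = np.array([1, 0, 3])      # column index of each nonzero
data    = np.array([2, 1, 3])      # nonzero values, row-major order
B = np.arange(8).reshape(4, 2)
assert np.array_equal(csr_spmm(indptr, indices, data, B), A @ B)
```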

Together, these contributions advance CIM as a scalable and efficient platform for future AI and data-centric systems.

Details

Title
Compute-in-Memory Circuits and Architectures for Efficient Acceleration of AI and Data Centric Workloads
Number of pages
117
Publication year
2025
Degree date
2025
School code
0010
Source
DAI-B 87/2(E), Dissertation Abstracts International
ISBN
9798290970172
Committee member
Seo, Jae-sun; Cao, Yu; Zhang, Jeff
University/institution
Arizona State University
Department
Electrical Engineering
University location
United States -- Arizona
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32114253
ProQuest document ID
3240621818
Document URL
https://www.proquest.com/dissertations-theses/compute-memory-circuits-architectures-efficient/docview/3240621818/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic