Abstract
This thesis augments and extends HPIPE, a state-of-the-art CNN inference accelerator for FPGAs. We first focus on the accelerator's infrastructure, adding a hardware unit that implements the Sigmoid function and automated unit tests that validate the accelerator's functionality. We then study how to leverage the AI-optimized Stratix 10 NX FPGAs to achieve up to a 7X speedup for convolution operations. Next, we extend HPIPE by integrating a hardware-friendly non-maximum suppression (NMS) unit to accelerate object detection, yielding the highest-performing single-shot detection-based (SSD-based) object detection accelerator for FPGAs. Finally, we build an automated CAD flow that partitions CNNs across multiple FPGAs communicating via 100 Gb Ethernet. Through a prototype system, we show that doubling the number of FPGAs yields a 2X performance improvement on three CNNs: MobileNet-V1, MobileNet-V2, and ResNet-50.





