Abstract

This thesis augments and extends HPIPE, the state-of-the-art CNN inference accelerator for FPGAs. We first focus on the infrastructure of the accelerator: we build an additional hardware unit that implements the Sigmoid function, and we add automated unit tests that validate the accelerator's functionality. We then study how to leverage the AI-optimized Stratix 10 NX FPGAs to achieve up to a 7X speedup for convolution operations. Next, we extend HPIPE by integrating it with a hardware-friendly non-maximum suppression (NMS) unit to accelerate object detection, providing the highest-performing single-shot detection-based (SSD-based) object detection accelerator for FPGAs. Finally, we build an automated CAD flow to partition CNNs across multiple FPGAs that communicate via 100 Gb Ethernet. We show through a prototype system that doubling the number of FPGAs results in a 2X performance improvement on three CNNs: MobileNet-V1, MobileNet-V2, and ResNet-50.

Details

Title
Extending Data Flow Architectures for Convolutional Neural Networks to Object Detection and Multiple FPGAs
Author
Ibrahim, Mohamed Abdelfattah Abdelghany
Publication year
2022
Publisher
ProQuest Dissertations & Theses
ISBN
9798834075752
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
2700520843
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.