Abstract
Convolutional Neural Networks (CNNs) have found widespread application in artificial intelligence fields such as computer vision and edge computing. However, as input data dimensionality and convolutional model depth continue to increase, deploying CNNs on edge and embedded devices faces significant challenges, including high computational demands, excessive hardware resource consumption, and prolonged computation times. The Decomposable Winograd Method (DWM), which decomposes large-size or large-stride kernels into smaller kernels, offers a more efficient route to inference acceleration in such resource-constrained environments. This work proposes a layer-to-layer unified input transformation based on the Decomposable Winograd Method, which reduces the computational complexity of the feature transformation unit through system-level parallel pipelining and operation reuse. Additionally, we introduce a reconfigurable, column-indexed Winograd computation unit design that minimizes hardware resource consumption, together with flexible data access patterns that support efficient computation. Finally, we propose a preprocessing shift-network system that enables low-latency data access and dynamic selection of the Winograd computation unit. Experimental evaluations on the VGG-16 and ResNet-18 networks demonstrate that our accelerator, deployed on the Xilinx XC7Z045 platform, achieves an average throughput of 683.26 GOPS and improves DSP efficiency (GOPS/DSP) by 5.8× compared to existing approaches.
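For readers unfamiliar with the underlying transform, the sketch below illustrates the standard 1-D Winograd algorithm F(2,3) that small-tile Winograd accelerators of this kind build on. It is a generic NumPy illustration, not the paper's DWM pipeline or hardware design: the matrices BT, G, AT and the function name are the textbook Lavin–Gray formulation, shown only to make concrete how two outputs of a 3-tap convolution are obtained with four element-wise multiplications instead of six.

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices (Lavin & Gray formulation).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs of a 3-tap convolution over a 4-sample input tile.

    Direct computation needs 6 multiplications; the Winograd form needs
    only 4 element-wise multiplications plus cheap additions.
    """
    U = G @ g          # kernel transform (can be precomputed offline)
    V = BT @ d         # input (feature) transform
    M = U * V          # 4 element-wise multiplications
    return AT @ M      # output (inverse) transform

# Usage: check against a direct sliding-window convolution.
d = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical input tile
g = np.array([0.5, 1.0, -1.0])       # hypothetical 3-tap kernel
direct = np.array([d[0:3] @ g, d[1:4] @ g])
assert np.allclose(winograd_f23(d, g), direct)
```

DWM extends this idea by splitting kernels that are larger than the Winograd tile, or that have stride greater than one, into several small-kernel sub-convolutions whose results are summed, so the same small transform can still be applied.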
Details
Subject: Hardware; Pipelining (computers); Artificial neural networks; Neural networks; Edge computing; Network latency; Decomposition; Design; Energy efficiency; Embedded systems; Computer vision; Methods; Algorithms; Digital signal processors; Artificial intelligence; Consumption; Field programmable gate arrays
Author affiliation: 1 Key Laboratory of Advanced Manufacturing and Automation Technology, Education Department of Guangxi Zhuang Autonomous Region, Guilin University of Technology, Guilin 541006, China;