Content area
The real-valued fast Fourier transform (RFFT) is well-suited for high-speed, low-power FFT processors, as it requires approximately half the arithmetic operations compared to the traditional complex-valued FFT (CFFT). While RFFT can be computed using CFFT hardware, a dedicated RFFT implementation offers advantages such as lower hardware complexity, reduced power consumption, and higher throughput. However, unlike CFFT, the irregular signal flow graph of RFFT presents challenges in designing efficient pipelined architectures. In our previous work, we have proposed a high-level programming approach using Open Computing Language (OpenCL) to implement the forward RFFT architectures on Field-Programmable Gate Arrays (FPGAs). In this article, we propose a high-level programming approach to implement the inverse RFFT architectures on FPGAs. By identifying regular computational patterns in the inverse RFFT flow graph, our method efficiently expresses the algorithm using a for loop, which is later fully unrolled using high-level synthesis tools to automatically generate a pipelined architecture. Experiments show that for a 4,096-point inverse RFFT, the proposed method achieves a 2.36x speedup and 2.92x better energy efficiency over CUDA FFT (CUFFT) on Graphics Processing Units (GPUs), and a 24.91x speedup and 18.98x better energy efficiency over Fastest Fourier Transform in the West (FFTW) on Central Processing Units (CPUs) respectively. Compared to Intel’s CFFT design on the same FPGA, the proposed one reduces 9% logic resources while achieving a 1.39x speedup. These results highlight the effectiveness of our approach in optimizing RFFT performance on FPGA platforms.
Details
Computation;
Performance evaluation;
Fourier transforms;
Graphics processing units;
Hardware;
Fast Fourier transformations;
Signal flow graphs;
Power management;
High level synthesis;
Medical equipment;
Design;
Algorithms;
Field programmable gate arrays;
Complexity;
High performance computing