Turbo Codes (TCs) are a family of convolutional codes that provide powerful Forward Error Correction (FEC) and operate near the Shannon limit for channel capacity. In the context of modern communication systems, such as those conforming to the DVB-RCS2 standard, Turbo Encoders (TEs) play a crucial role in ensuring robust data transmission over noisy satellite links. A key computational bottleneck in the Turbo Encoder is the non-uniform interleaving stage, where input bits are rearranged according to a dynamically generated permutation pattern. This stage often requires the intermediate storage of data, resulting in increased latency and reduced throughput, especially in embedded or real-time systems. This paper introduces a vector processing algorithm designed to accelerate the interleaving stage of the Turbo Encoder. The proposed algorithm is tailored for vector DSP architectures (e.g., CEVA-XC4500), and leverages the hardware’s SIMD capabilities to perform the permutation operation in a structured, phase-wise manner. Our method adopts a modular Load–Execute–Store design, facilitating efficient memory alignment, deterministic latency, and hardware portability. We present a detailed breakdown of the algorithm’s implementation, compare it with a conventional scalar (serial) model, and analyze its compatibility with the DVB-RCS2 specification. Experimental results demonstrate significant performance improvements, achieving a speed-up factor of up to 3.4× in total cycles, 4.8× in write operations, and 7.3× in read operations, relative to the baseline scalar implementation. The findings highlight the effectiveness of vectorized permutation in FEC pipelines and its relevance for high-throughput, low-power communication systems.
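To make the contrast between the scalar baseline and the phase-wise Load–Execute–Store structure concrete, the following is a minimal sketch in Python. It is not the paper's CEVA-XC4500 implementation: the function names, the `width` parameter (standing in for the number of SIMD lanes), and the `toy_permutation` generator are all hypothetical, and the toy permutation is an arbitrary affine map chosen for illustration, not the permutation defined by DVB-RCS2.

```python
from math import gcd
from typing import List, Sequence


def toy_permutation(n: int, step: int = 7, offset: int = 3) -> List[int]:
    """Illustrative permutation only (NOT the DVB-RCS2 interleaver).

    The affine map j -> (step * j + offset) mod n is a bijection
    whenever step and n are coprime.
    """
    if gcd(step, n) != 1:
        raise ValueError("step must be coprime with n")
    return [(step * j + offset) % n for j in range(n)]


def interleave_scalar(data: Sequence[int], perm: Sequence[int]) -> List[int]:
    """Baseline: one read and one write per element."""
    out = [0] * len(perm)
    for j, src in enumerate(perm):
        out[j] = data[src]
    return out


def interleave_blocked(
    data: Sequence[int], perm: Sequence[int], width: int = 8
) -> List[int]:
    """Block-wise model of a Load / Execute / Store pipeline.

    Each iteration handles up to `width` elements, a stand-in
    (assumption) for one vector register's worth of lanes.
    """
    out = [0] * len(perm)
    for start in range(0, len(perm), width):
        idx = perm[start:start + width]        # Load: a block of source indices
        lanes = [data[i] for i in idx]         # Execute: gather the permuted lanes
        out[start:start + len(lanes)] = lanes  # Store: one contiguous write
    return out


if __name__ == "__main__":
    bits = [i % 2 for i in range(24)]
    perm = toy_permutation(len(bits))
    assert interleave_blocked(bits, perm) == interleave_scalar(bits, perm)
```

Both functions compute the same output; the blocked version only regroups the work so that the index fetch, the gather, and the write-back each operate on a whole block. On real vector hardware, that grouping is what would allow the number of memory operations to drop.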
Details
Channel capacity;
Satellite communications;
Hardware;
Error correction;
Turbo codes;
Architecture;
Data transmission;
Codes;
Coders;
Efficiency;
Data integrity;
Embedded systems;
Error correction & detection;
Digital video;
Communications systems;
Algorithms;
Vector processing (computers);
Data storage;
Modular structures;
Permutations;
Real time;
Array processors
Boxerman, Ohad 2; Ben-Shimol, Yehuda 2; Manor, Erez 1; Greenberg, Shlomo 1
1 Department of Electrical and Computer Engineering, Ben Gurion University, Beer-Sheva 84105, Israel; Department of Computer Science, Sami Shamoon College of Engineering, Beer-Sheva 84100, Israel
2 Department of Electrical and Computer Engineering, Ben Gurion University, Beer-Sheva 84105, Israel