Energy-efficient implementations are essential to modern society and will remain so, especially in digital signal processing (DSP) and communication systems, where the rapid growth in the number of devices, such as battery-driven internet of things (IoT) sensors, necessitates low-complexity and low-power solutions. This thesis concentrates on two areas: constant multiplication and active user detection in wireless networks.
Constant multiplication can be implemented using a shift-and-add (SHA) network. Typically, the number of adders/subtracters is minimized, but the number of cascaded adders/subtracters (the depth) also impacts the power consumption. Two classes of algorithms are used to solve the problem: adder graph algorithms and sub-expression sharing algorithms. Adder graph algorithms typically yield better results for a single input or a low number of inputs, since they do not depend on the representation of the numbers involved. However, they can lead to very long run times and worse results when the number of inputs is high. At the same time, it is known that the problem can be transposed. For example, a sum of products (many inputs) can be transposed to a single input with multiple coefficients, meaning that it is possible to transpose the problem to the more advantageous form, solve it, and then transpose the solution back. However, no systematic algorithm that takes depth into account has been available for obtaining the transposed result. This thesis introduces a systematic algorithm that obtains the minimum depth of the transposed SHA network subject to the input.
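As an illustration of the shift-and-add idea, the following minimal Python sketch multiplies by a constant using only shifts, additions, and subtractions. It is not the algorithm of this thesis: it assumes a canonical signed-digit (CSD) representation as a simple representation-dependent baseline, and the function names (`csd_digits`, `csd_multiply`, `multiply_by_45_adder_graph`) are hypothetical. The hand-written version for the constant 45 shows how a representation-independent adder graph can need fewer adders than the CSD form.

```python
def csd_digits(c):
    """Canonical signed-digit form of a positive integer c.

    Returns digits in {-1, 0, 1}, least significant first.
    """
    if c <= 0:
        raise ValueError("c must be a positive integer")
    digits = []
    while c != 0:
        if c & 1:
            # +1 if c mod 4 == 1, -1 if c mod 4 == 3
            d = 2 - (c & 3)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def csd_multiply(x, c):
    """Compute c * x from the CSD digits of c using shifts and adds only.

    Returns the product and the number of adders/subtracters used.
    """
    terms = [(d, k) for k, d in enumerate(csd_digits(c)) if d != 0]
    acc = 0
    for d, k in terms:
        shifted = x << k          # a shift is free in hardware (wiring)
        acc = acc + shifted if d > 0 else acc - shifted
    return acc, len(terms) - 1    # n nonzero digits need n - 1 adders


def multiply_by_45_adder_graph(x):
    """45 = 5 * 9 realized as a two-adder graph (depth 2)."""
    y5 = (x << 2) + x             # adder 1: 5x = 4x + x
    return (y5 << 3) + y5         # adder 2: 45x = 8 * 5x + 5x


print(csd_digits(45))                 # [1, 0, -1, 0, -1, 0, 1], i.e. 64 - 16 - 4 + 1
print(csd_multiply(3, 45))            # (135, 3): three adders/subtracters
print(multiply_by_45_adder_graph(3))  # 135 with two adders
```

For the constant 45, the CSD form has four nonzero digits and therefore needs three adders/subtracters, whereas the factored form 45 = 5 · 9 needs only two.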
The practical application of the constant multiplication problem is demonstrated through the implementation of a reconfigurable lowpass equalizer, a component widely used in communication systems and DSP. Various formulations of the constant multiplication problem, combined with pipelining, are explored to identify the most efficient implementation in a 28 nm FD-SOI standard-cell technology. The resulting implementation significantly reduces power consumption, highlighting the real-world impact of this research.
The second research area focuses on the challenge of detecting active users in massive machine-type communication (mMTC) scenarios involving large numbers of devices. The problem is addressed using a pilot-hopping sequence method and is formulated as a non-negative least-squares (NNLS) problem. Two NNLS algorithms, fast projected gradient (Fast) and multiplicative updates (Mult), are used to solve the active user detection problem. Both are implemented in a 28 nm FD-SOI process and optimized for energy efficiency, chip area, and detection speed. The results demonstrate the ability to perform over a million detections per second with significantly lower energy consumption than existing methods. However, the implementations lack reconfigurability, and it is debatable whether such high detection rates are relevant for current practical applications.
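The two NNLS solvers can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the hardware algorithms of this thesis: the problem is taken to be minimizing ||Ax - b||² subject to x ≥ 0 with non-negative A and b, "fast" is assumed to mean a Nesterov-accelerated projected gradient step, and the small matrix `A`, the vector `b`, the detection threshold, and all function names are hypothetical.

```python
def matvec(M, v):
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def normal_equations(A, b):
    """Return G = A^T A and h = A^T b."""
    cols = list(zip(*A))
    G = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    h = [sum(a * bi for a, bi in zip(col, b)) for col in cols]
    return G, h


def nnls_mult(A, b, iters=2000, eps=1e-12):
    """Multiplicative updates; assumes all entries of A and b are >= 0."""
    G, h = normal_equations(A, b)
    x = [1.0] * len(h)                      # must start strictly positive
    for _ in range(iters):
        Gx = matvec(G, x)
        x = [xi * hi / (gi + eps) for xi, hi, gi in zip(x, h, Gx)]
    return x


def nnls_fast(A, b, iters=2000):
    """Projected gradient with Nesterov momentum (assumed 'fast' variant)."""
    G, h = normal_equations(A, b)
    # Max absolute row sum bounds the largest eigenvalue of G from above.
    L = max(sum(abs(g) for g in row) for row in G)
    x = [0.0] * len(h)
    y, t = x[:], 1.0
    for _ in range(iters):
        grad = [gi - hi for gi, hi in zip(matvec(G, y), h)]
        x_new = [max(0.0, yi - gi / L) for yi, gi in zip(y, grad)]  # project
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        y = [xn + (t - 1.0) / t_new * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


def detect(x, threshold=0.5):
    """Declare a user active if its estimate exceeds the threshold."""
    return [xi > threshold for xi in x]


# Hypothetical toy problem: 4 observations, 3 users, users 0 and 2 active.
A = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0],
     [1, 0, 0]]
b = [3, 1, 2, 2]                            # equals A applied to [2, 0, 1]

print(detect(nnls_mult(A, b)))              # [True, False, True]
print(detect(nnls_fast(A, b)))              # [True, False, True]
```

The multiplicative update keeps the estimate non-negative without an explicit projection but only approaches zero asymptotically for inactive users, which is why the sketch applies a threshold before declaring a user active.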
To enhance practicality and reconfigurability, the Fast algorithm is implemented using a reconfigurable time-multiplexed architecture, which reduces resource usage by reusing hardware within one iteration. This architecture employs a novel user re-ordering method to enable parallel memory access and continuous operation across successive iterations, thereby increasing the execution speed. The architecture is implemented on numerous FPGA families, demonstrating resource efficiency, and achieves reconfigurability by storing the pilot-hopping sequences in memory. It obtains a more practically usable detection rate of about one thousand to a few thousand detections per second, depending on the FPGA family.