The widespread adoption of deep learning has led to a surge in demand for the efficient deployment of Deep Neural Networks (DNNs) on embedded and resource-constrained systems. These platforms are increasingly tasked with performing complex inference workloads in real time, yet they must do so under strict limitations on energy consumption, computation, and memory resources. As DNN models grow in complexity to support advanced applications such as time-series forecasting and image classification, ensuring their efficient execution without compromising performance becomes a key design challenge. This calls for tailored algorithmic and architectural solutions that can meet application-specific constraints while maximizing the benefits of modern hardware accelerators.
This dissertation proposes a set of algorithmic approaches aimed at enhancing the efficiency and adaptability of deep learning models for embedded systems. The research centers on three primary contributions: (i) a hybrid Long Short-Term Memory (LSTM)-Transformer architecture optimized for multi-step residential power load forecasting, which integrates sequence modeling and attention mechanisms to improve accuracy under training time constraints; (ii) a lightweight hierarchical enhancement for deep neural networks that augments baseline classifiers with cascaded binary Convolutional Neural Networks (CNNs) and Vision Transformers, achieving higher classification accuracy and reduced inference time; and (iii) an energy-aware scheduling framework for DNN inference that combines dynamic batch size selection, GPU frequency adjustment, and concurrent task mapping across multi-GPU systems to minimize energy use without violating real-time deadlines.
In summary, this dissertation presents a comprehensive approach to designing and managing deep learning workflows on embedded platforms, emphasizing performance-efficiency trade-offs. Compared to existing solutions, the proposed methods demonstrate significant improvements in both predictive accuracy and resource utilization, offering practical pathways for deploying DNNs in sustainable computing environments.