Abstract
Deploying deep neural networks (DNNs) in resource-limited environments such as smartwatches, IoT nodes, and intelligent sensors poses significant challenges due to tight constraints on memory, computing power, and energy budgets. This paper presents a comprehensive review of recent advances in accelerating DNN inference on edge platforms, focusing on model compression, compiler optimizations, and hardware–software co-design. We analyze the trade-offs among latency, energy, and accuracy across these techniques and highlight practical deployment strategies for real-world devices. In particular, we categorize existing frameworks by their architectural targets and adaptation mechanisms, and we discuss open challenges such as runtime adaptability and hardware-aware scheduling. This review aims to guide the development of efficient and scalable edge intelligence solutions.
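The techniques surveyed in the abstract are not implemented in this metadata page; purely as a hedged illustration of one of them, the sketch below applies post-training dynamic quantization with PyTorch, a common model-compression step before edge deployment. The toy two-layer network, tensor shapes, and layer selection are illustrative assumptions, not artifacts of the review.

# A minimal sketch of model compression via post-training dynamic
# quantization, assuming PyTorch is available on the build host.
import torch
import torch.nn as nn

# Hypothetical toy network standing in for a DNN destined for an edge device.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

# Convert Linear-layer weights to 8-bit integers; activations are quantized
# on the fly at inference time, trading a little accuracy for a smaller
# memory footprint and cheaper integer arithmetic.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Inference runs unchanged on the compressed model.
x = torch.randn(1, 128)
with torch.no_grad():
    y = quantized(x)
print(y.shape)  # torch.Size([1, 10])

Dynamic quantization is only one point in the latency/energy/accuracy trade-off space the review maps; pruning, static quantization, and knowledge distillation occupy other points in that space.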
Keywords
Smartphones; Hardware; Bandwidths; Optimization techniques; Artificial neural networks; Data processing; Software upgrading; Co-design; Privacy; Energy consumption; Efficiency; Energy budget; Neurons; Embedded systems; Artificial intelligence; Edge computing; Power; Sensors; Decision making; Autonomous vehicles; Neural networks; Inference; Network latency; Remote computing; Connectivity; Intelligence; Algorithms; Surveillance
; Park, Hyun-Cheol 1; Kang, Bongsoon 2
1 Department of Computer Engineering, Korea National University of Transportation, Chungju 27469, Republic of Korea; [email protected] (D.N.); [email protected] (H.-C.P.)
2 Department of Electronics Engineering, Dong-A University, Busan 49315, Republic of Korea