
Abstract

Distributed inference in resource-constrained heterogeneous edge clusters is fundamentally limited by disparities in device capabilities and by load imbalance. Existing methods predominantly optimize a single-pipeline allocation of partitioned sub-models, which often leads to load imbalance and suboptimal resource utilization when batches are processed concurrently. To address these challenges, we propose a non-uniform deployment inference framework (NUDIF) that achieves high-throughput distributed inference by adapting to heterogeneous resources and balancing processing capability across pipeline stages. NUDIF formulates deployment as a mixed-integer nonlinear programming (MINLP) problem: it plans the number of instances of each sub-model and selects the devices on which to deploy those instances, subject to computational capacity, memory constraints, and communication latency. The optimization minimizes inter-stage processing discrepancies and maximizes resource utilization. Experimental evaluations show that NUDIF improves system throughput by an average of 9.95% over traditional single-pipeline optimization methods across cluster device configurations of various scales.
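The deployment idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' MINLP formulation or solver: it is a brute-force search over a simplified model in which each device hosts at most one sub-model instance, a stage's processing rate is the sum of its instances' rates, pipeline throughput is the rate of the slowest stage, and communication latency is ignored. All names (`stage_rates`, `best_deployment`, `work`, `mem_need`, `speed`, `mem_cap`) and all numbers are hypothetical.

```python
from itertools import product


def stage_rates(assignment, work, speed):
    """Aggregate processing rate of each stage (batches per unit time)."""
    rates = [0.0] * len(work)
    for dev, stage in enumerate(assignment):
        if stage is not None:
            rates[stage] += speed[dev] / work[stage]
    return rates


def best_deployment(work, mem_need, speed, mem_cap, max_instances=None):
    """Exhaustively search device-to-stage assignments.

    assignment[d] is the stage hosted on device d, or None if d is idle.
    Assumption: throughput is the rate of the slowest stage.
    """
    n_stages = len(work)
    choices = [None] + list(range(n_stages))
    best, best_tp = None, 0.0
    for assignment in product(choices, repeat=len(speed)):
        # Memory constraint: an instance must fit on its device.
        if any(s is not None and mem_need[s] > mem_cap[d]
               for d, s in enumerate(assignment)):
            continue
        # Optional cap on instances per stage (1 = single pipeline).
        if max_instances is not None and any(
                assignment.count(s) > max_instances for s in range(n_stages)):
            continue
        tp = min(stage_rates(assignment, work, speed))
        if tp > best_tp:
            best, best_tp = assignment, tp
    return best, best_tp


# Hypothetical two-stage model on four heterogeneous devices.
work = [4.0, 2.0]             # compute cost per batch for each stage
mem_need = [2, 1]             # memory needed by one instance of each stage
speed = [4.0, 2.0, 2.0, 1.0]  # compute speed of each device
mem_cap = [4, 2, 2, 1]        # memory capacity of each device

_, single_tp = best_deployment(work, mem_need, speed, mem_cap, max_instances=1)
_, multi_tp = best_deployment(work, mem_need, speed, mem_cap)
print(single_tp, multi_tp)  # 1.0 1.5
```

In this toy setting, allowing a different number of instances per stage raises the bottleneck rate from 1.0 to 1.5, because the slower stage receives more aggregate compute. Exhaustive search is only practical for a handful of devices; a realistic instance would call for an MINLP solver or a heuristic.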

Details

Title
NUDIF: A Non-Uniform Deployment Framework for Distributed Inference in Heterogeneous Edge Clusters
Author
Li, Peng 1; Chen, Qing 1; Liu, Hao 2

 National Key Laboratory of Complex Aviation System Simulation, Chengdu 610036, China; [email protected], Southwest China Institute of Electronic Technology, Chengdu 610036, China 
 School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China; [email protected] 
Publication title
Volume
17
Issue
4
First page
168
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
1999-5903
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
 
 
Online publication date
2025-04-11
Milestone dates
2025-03-09 (Received); 2025-04-07 (Accepted)
First posting date
2025-04-11
ProQuest document ID
3194606736
Document URL
https://www.proquest.com/scholarly-journals/nudif-non-uniform-deployment-framework/docview/3194606736/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-04-25
Database
ProQuest One Academic