ABSTRACT
Deep neural network (DNN) partitioning is an effective strategy for accelerating deep learning (DL) tasks. Computing and network convergence (CNC), a pioneering technology, integrates dispersed computing resources and bandwidth through the network control plane so that both can be used efficiently. This paper presents a novel network-cloud (NC) architecture for DL inference in CNC scenarios, in which network devices participate directly in computation and thereby avoid extra transmission costs. Considering a chain path of multi-hop computing-capable network nodes and one cloud node, we leverage deep reinforcement learning (DRL) to develop a joint optimization algorithm for DNN partitioning, subtask offloading, and computing resource allocation based on a deep Q network (DQN), referred to as POADQ. To minimize delay, POADQ invokes a low-complexity subtask offloading and computing resource allocation (SORA) algorithm: the DQN searches for the optimal DNN partition point, while SORA identifies the optimal offloading node for the next subtask through our proposed next optimal node prediction with resource allocation (NONPRA) method, which selects the node with the smallest predicted increase in cost. We conduct experiments comparing POADQ with other schemes; the results show that POADQ outperforms the baselines in reducing the average subtask delay.
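To make the greedy selection concrete, the following is a minimal Python sketch of a NONPRA-style step: for the next subtask, pick the candidate node whose predicted added cost is smallest. All identifiers (Node, predicted_delay, next_optimal_node) and the toy cost model (transmission delay plus computation delay) are illustrative assumptions, not the paper's exact formulation.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    compute: float    # available computing capacity (cycles/s) -- assumed field
    bandwidth: float  # link bandwidth to this node (bits/s) -- assumed field

def predicted_delay(subtask_cycles: float, data_bits: float, node: Node) -> float:
    """Toy cost model: transmission delay plus computation delay."""
    return data_bits / node.bandwidth + subtask_cycles / node.compute

def next_optimal_node(subtask_cycles: float, data_bits: float,
                      candidates: list[Node]) -> Node:
    """NONPRA-style greedy choice: the node with the smallest predicted cost increase."""
    return min(candidates, key=lambda n: predicted_delay(subtask_cycles, data_bits, n))

if __name__ == "__main__":
    # Hypothetical chain of computing-capable network nodes plus one cloud node.
    nodes = [Node("net-1", 2e9, 1e8), Node("net-2", 5e9, 5e7), Node("cloud", 2e10, 1e7)]
    best = next_optimal_node(subtask_cycles=4e8, data_bits=8e6, candidates=nodes)
    print(f"offload next subtask to {best.name}")
```

In the paper's scheme this selection would run per subtask inside SORA, with the DQN separately searching for the DNN partition point; the sketch covers only the node-selection step.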
Details
Wang, Zhili 1; Yang, Yang 1; Wang, Sining 2
1 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
2 State Grid Information & Telecommunication Group Co., Ltd., Beijing, China
