With the rapid growth of computation-intensive applications, high-performance computing (HPC) clusters have become essential for scientific computing, AI training, and industrial simulation. However, job scheduling in HPC clusters remains challenging because of heterogeneous resources, diverse task demands, and complex constraints. Traditional scheduling methods such as first-come, first-served (FCFS), shortest job first (SJF), and backfilling adapt poorly to changing workloads and struggle to achieve global optimization in large-scale environments. To address these issues, this paper proposes an intelligent scheduling method based on graph neural networks (GNNs) and deep reinforcement learning. A resource-constrained job–node bipartite graph is constructed to model task–node matching relationships, with node features capturing resource states and task features capturing task demands. A GNN encodes the scheduling state, and an Actor–Critic reinforcement learning framework guides scheduling decisions. Simulation results show that, compared with other schedulers, the proposed GNN–Actor–Critic approach significantly reduces average waiting time, average turnaround time, and average slowdown while increasing overall resource utilization, demonstrating its effectiveness and practicality for HPC cluster scheduling.
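The abstract does not specify how the resource-constrained job–node bipartite graph is built, so the following is only a minimal sketch of one plausible reading: an edge connects a job to a node only when the node currently has enough free resources to host the job. The names `Job`, `Node`, and `build_bipartite_edges`, and the choice of CPU count and memory as the only resources, are illustrative assumptions rather than the paper's actual definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # Hypothetical task features: requested CPUs and memory.
    job_id: str
    cpus: int
    mem_gb: float


@dataclass(frozen=True)
class Node:
    # Hypothetical node features: currently free CPUs and memory.
    node_id: str
    free_cpus: int
    free_mem_gb: float


def build_bipartite_edges(jobs, nodes):
    """Return (job_id, node_id) pairs where the node can currently host the job.

    Assumption: "resource-constrained" means an edge exists only if every
    requested resource fits within the node's free capacity.
    """
    return [
        (job.job_id, node.node_id)
        for job in jobs
        for node in nodes
        if job.cpus <= node.free_cpus and job.mem_gb <= node.free_mem_gb
    ]


jobs = [Job("j1", cpus=4, mem_gb=8.0), Job("j2", cpus=16, mem_gb=64.0)]
nodes = [Node("n1", free_cpus=8, free_mem_gb=16.0),
         Node("n2", free_cpus=32, free_mem_gb=128.0)]

edges = build_bipartite_edges(jobs, nodes)
print(edges)
```

In this toy instance the small job `j1` is connected to both nodes, while `j2` is connected only to `n2`. In a setup like the one the abstract describes, such an edge list, together with the job and node feature vectors, would form the input that a GNN encoder consumes before the Actor–Critic policy selects a job–node pair.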
Details
Computation;
Network topologies;
Communication;
Optimization;
Nodes;
Clusters;
Machine learning;
Heuristic;
Workloads;
High performance computing;
Computer simulation;
Mathematical programming;
Scheduling;
Artificial intelligence;
Graph theory;
Global optimization;
Graph neural networks;
Graph representations;
Decision making;
Neural networks;
Linear programming;
Algorithms;
Resource utilization;
Deep learning;
Neighborhoods
; Zhou, Jingbo 1
; Wang, Zhijun 2
1 College of Electronic Information Engineering, Changchun University of Science and Technology, Weixing Road No. 7089, Changchun 130022, China; [email protected] (X.B.); [email protected] (J.Z.)
2 High Performance Computing Center, Changchun Normal University, North Changji Road No. 677, Changchun 130032, China