Abstract

With the rapid growth of computation-intensive applications, high-performance computing (HPC) clusters have become essential for scientific computing, AI training, and industrial simulation. However, job scheduling in HPC clusters remains challenging because of heterogeneous resources, diverse task demands, and complex constraints. Traditional scheduling methods such as first-come, first-served (FCFS), shortest job first (SJF), and backfilling adapt poorly to changing workloads and struggle to achieve global optimization in large-scale environments. To address these issues, this paper proposes an intelligent scheduling method based on graph neural networks (GNNs) and deep reinforcement learning. A resource-constrained job–node bipartite graph is constructed to model task–node matching relationships, with node features capturing resource states and task features capturing task demands. A GNN encodes the scheduling state, and an Actor–Critic reinforcement learning framework guides scheduling decisions. Simulation results show that, compared with other schedulers, the proposed GNN–Actor–Critic approach significantly reduces average waiting time, average turnaround time, and average slowdown while increasing overall resource utilization, demonstrating its effectiveness and practicality for HPC cluster scheduling.
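To make the graph construction step concrete, the following is a minimal sketch of how a resource-constrained job–node bipartite graph could be built. It is not the authors' implementation: the `Job` and `Node` classes, their fields (CPU count and memory), and the function `build_bipartite_edges` are hypothetical names chosen for illustration, and the abstract does not state which resource dimensions the paper actually models. An edge is added only when a node's free resources cover a job's demand.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical task features: requested CPUs and memory.
    job_id: str
    cpus: int
    mem_gb: float


@dataclass
class Node:
    # Hypothetical node features: currently free CPUs and memory.
    node_id: str
    free_cpus: int
    free_mem_gb: float


def build_bipartite_edges(jobs, nodes):
    """Return (job_id, node_id) pairs where the node can host the job.

    Assumption: feasibility means every requested resource is no larger
    than the node's free amount of that resource.
    """
    edges = []
    for job in jobs:
        for node in nodes:
            if job.cpus <= node.free_cpus and job.mem_gb <= node.free_mem_gb:
                edges.append((job.job_id, node.node_id))
    return edges


jobs = [Job("j1", 4, 8.0), Job("j2", 16, 64.0)]
nodes = [Node("n1", 8, 32.0), Node("n2", 32, 128.0)]
edges = build_bipartite_edges(jobs, nodes)
print(edges)
```

In this toy input, job `j2` is too large for node `n1`, so that pair gets no edge. In a setup like the one the abstract describes, the resulting edge list and the per-job and per-node feature vectors would be the input to the GNN encoder, and the edges would restrict the actor's choices to feasible job–node assignments.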

Details

Title
A High-Performance Computing Cluster Intelligent Scheduling Algorithm Based on Graph Neural Network and Actor–Critic
Author
Bai, Xuemei 1; Zhou, Jingbo 1; Wang, Zhijun 2

1 College of Electronic Information Engineering, Changchun University of Science and Technology, Weixing Road No. 7089, Changchun 130022, China; [email protected] (X.B.); [email protected] (J.Z.)
2 High Performance Computing Center, Changchun Normal University, North Changji Road No. 677, Changchun 130032, China
Publication title
Volume
15
Issue
1
First page
116
Number of pages
30
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-12-25
Milestone dates
2025-11-24 (Received); 2025-12-24 (Accepted)
First posting date
25 Dec 2025
ProQuest document ID
3291798833
Document URL
https://www.proquest.com/scholarly-journals/high-performance-computing-cluster-intelligent/docview/3291798833/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2026-01-09
Database
ProQuest One Academic