Presentation

GNN-RL: An Intelligent HPC Resource Scheduler
Description
Efficient resource allocation in high-performance computing (HPC) environments is crucial for optimizing utilization, minimizing make-span, and enhancing throughput. We propose GNN-RL, a novel intelligent scheduler that combines a Graph Neural Network with Reinforcement Learning in a hybrid model and learns scheduling policies from historical workload data. Experimental results show that GNN-RL significantly outperforms conventional methods. Compared to the First-Come-First-Served (FCFS) baseline, GNN-RL achieves a 2.1-fold increase in resource utilization (84.25% vs. 39.84%), a 114-fold improvement in throughput (40,061.86 vs. 351.69 jobs/s), and a 114-fold reduction in make-span (4.50 s vs. 513.11 s). GNN-RL also surpasses EASY Backfilling, with 1.3 times higher resource utilization and a 2-fold improvement in both throughput and make-span. The fairness index remains consistent across schedulers, indicating that GNN-RL maintains fairness while improving the other metrics. Our findings suggest GNN-RL is a significant advancement in intelligent HPC resource management, enabling more efficient and responsive computing environments.
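The fold changes quoted in the description can be re-derived from the reported figures. The short Python sketch below does only that arithmetic; the dictionary keys and the `fold_change` helper are illustrative names, not part of the GNN-RL work, and the EASY Backfilling comparison is left out because the description gives no raw values for it.

```python
# Figures quoted in the description, as (GNN-RL, FCFS baseline) pairs.
reported = {
    "utilization_pct": (84.25, 39.84),
    "throughput_jobs_per_s": (40061.86, 351.69),
    "makespan_s": (4.50, 513.11),
}

# Make-span improves when it shrinks; the other two improve when they grow.
LOWER_IS_BETTER = {"makespan_s"}


def fold_change(gnn_rl: float, fcfs: float, lower_is_better: bool) -> float:
    """Return the improvement factor of GNN-RL over FCFS (hypothetical helper)."""
    return fcfs / gnn_rl if lower_is_better else gnn_rl / fcfs


folds = {
    name: fold_change(gnn_rl, fcfs, name in LOWER_IS_BETTER)
    for name, (gnn_rl, fcfs) in reported.items()
}

for name, value in folds.items():
    print(f"{name}: {value:.1f}x")
# utilization_pct: 2.1x
# throughput_jobs_per_s: 113.9x
# makespan_s: 114.0x
```

Rounded to whole numbers, the throughput and make-span factors both come out at 114, matching the figures stated above.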
Event Type
ACM Student Research Competition: Graduate Poster
ACM Student Research Competition: Undergraduate Poster
Doctoral Showcase
Posters
Time
Tuesday, 19 November 2024, 12pm - 5pm EST
Location
B302-B305
Registration Categories
TP
XO/EX