Presentation

Resource Adaptivity at Task-Level
Description

Traditional static resource allocation in supercomputers (jobs retain a fixed set of resources) leads to inefficiencies. Resource adaptivity (jobs can change resources at runtime) significantly increases supercomputer efficiency.

This work exploits Asynchronous Many-Task (AMT) programming, which is particularly well suited to adaptivity, thanks to its transparent resource management. The AMT runtime system dynamically assigns user-defined small tasks to workers to achieve load balancing and adapt to resource changes.
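To illustrate why small tasks make resource changes transparent to the program, here is a minimal, deterministic sketch. It is not the runtime system described in this work: the function `run_adaptive`, the round-based execution model, and the `schedule` parameter are assumptions made purely for illustration. Tasks sit in a shared queue, and in each round every currently available worker takes one task, so a change in the worker count only affects how quickly the queue drains.

```python
from collections import deque


def run_adaptive(tasks, schedule):
    """Run callables from a shared queue in rounds.

    schedule[r] is the (hypothetical) number of workers available in
    round r; the last entry is reused for all later rounds.
    Returns the task results in queue order and the number of rounds used.
    """
    if not schedule or any(w < 1 for w in schedule):
        raise ValueError("schedule needs at least one entry, all >= 1")

    queue = deque(tasks)
    results = []
    rounds = 0
    while queue:
        workers = schedule[min(rounds, len(schedule) - 1)]
        # Each available worker takes at most one task per round.
        for _ in range(min(workers, len(queue))):
            task = queue.popleft()
            results.append(task())
        rounds += 1
    return results, rounds


tasks = [lambda i=i: i * i for i in range(10)]
static_results, static_rounds = run_adaptive(tasks, [2])
grown_results, grown_rounds = run_adaptive(tasks, [2, 2, 4])
```

In this sketch the task code is identical in both runs; only the schedule differs. The run that grows from two to four workers finishes in fewer rounds and produces the same results.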

Contributions of this work include techniques for malleability and evolving capabilities, allowing programs to dynamically change resources without interrupting computation. Heuristics for automatic load detection determine when to start or terminate processes, which is particularly beneficial for unpredictable workloads. Practicality is demonstrated by adapting the GLB library. A generic communication interface enables interaction between programs and resource managers. Evaluations with a prototype resource manager show significant improvements in batch makespan, node utilization, and job turnaround time for both malleable and evolving jobs.
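The abstract does not specify the load-detection heuristics, so the following is only a hypothetical sketch of what such a decision rule could look like. The function `decide`, its parameters, and the threshold value are assumptions, not the heuristics used in this work or in the GLB library. The rule requests growth when the backlog per worker is high and requests shrinking when workers sit idle with nothing queued.

```python
def decide(pending_tasks, workers, idle_workers,
           grow_threshold=4.0, min_workers=1):
    """Return "grow", "shrink", or "keep" for the current load snapshot.

    All parameters and thresholds are illustrative assumptions:
    pending_tasks  - tasks waiting in queues
    workers        - processes currently running
    idle_workers   - processes that currently have no work
    grow_threshold - backlog per worker above which more processes help
    min_workers    - never shrink below this many processes
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")

    # Nothing left to hand out and some workers idle: release resources.
    if pending_tasks == 0 and idle_workers > 0 and workers > min_workers:
        return "shrink"

    # Large backlog per worker: ask the resource manager for more.
    if pending_tasks / workers > grow_threshold:
        return "grow"

    return "keep"


decisions = [
    decide(pending_tasks=100, workers=4, idle_workers=0),
    decide(pending_tasks=0, workers=4, idle_workers=2),
    decide(pending_tasks=8, workers=4, idle_workers=0),
]
```

A real heuristic would likely smooth these measurements over time to avoid oscillating between growing and shrinking; that detail is omitted here.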