Presentation

Applied AI and Federated Learning in HPC: Because Even Supercomputers Need a Team
Description

The High-Performance Computing (HPC) paradigm, which forms the backbone of global cloud infrastructure, has experienced exponential growth through multiple iterations.
Initially used for scientific computing, later for productivity, and more recently for Artificial Intelligence (AI) based services, HPC has seen application growth that further adds to the complexity of its hardware-software ecosystem, creating new challenges and opportunities in the domain. Because computing is inherently distributed, tiered, and evolving, many decisions must be made locally across non-overlapping decision boundaries, often with multiple local and global optimization objectives. These objectives span energy, cost, performance, and environmental impact. In this paper, we present our most recent work on applications of AI for HPC, with a special focus on federated digital twins, intelligent storage buffer caches, application performance projection, and energy-aware scheduling.