Presentation
Slinky: The Missing Link Between Slurm and Kubernetes
Description
AI and ML engineers are often unfamiliar with traditional HPC environments and rely on cloud native systems like Kubernetes to abstract away complex system management. However, these platforms lack the fine-grained resource control and advanced scheduling features that HPC, AI, and ML workloads need. Slurm, an open-source HPC workload manager, excels at allocating resources, managing parallel jobs, and handling task queues. In this talk, we introduce our new project, Slinky, which bridges the gap between the HPC and cloud native worlds by running Slurm in Kubernetes. By combining Slurm's robust capabilities with Kubernetes' user-friendly interface, Slinky delivers HPC-level performance and scheduling within an accessible cloud native platform. This integration lets AI and ML engineers harness the full potential of their resources without requiring extensive systems expertise.