Presentation

SWARM: Scientific Workflow Applications on Resilient Metasystem
Description
Current centralized resource management strategies typically require a global view of distributed HPC systems and rely on a cluster-wide resource manager that schedules jobs using static, expert-tuned rules. This centralized decision-making approach limits resilience, efficiency, and scalability. In this work, we describe our initial progress in the SWARM project, which takes a novel decentralized multi-agent approach that leverages Swarm Intelligence (SI) and consensus strategies for enhanced robustness, resilience, and fault tolerance. We present our foundational SWARM system model, which aims to improve network overlays, enhance job selection using multi-agent consensus algorithms, and design SI-inspired scheduling approaches.
Event Type
ACM Student Research Competition: Graduate Poster
ACM Student Research Competition: Undergraduate Poster
Doctoral Showcase
Posters
Time
Tuesday, 19 November 2024, 12pm - 5pm EST
Location
B302-B305
Registration Categories
TP
XO/EX