BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234540Z
LOCATION:B306
DTSTART;TZID=America/New_York:20241118T112000
DTEND;TZID=America/New_York:20241118T114500
UID:submissions.supercomputing.org_SC24_sess751_ws_p3hpc108@linklings.com
SUMMARY:Autonomous Execution for Multi-GPU Systems: Compiler Support
DESCRIPTION:Javid Baydamirli (Koç University, Turkey); Tal Ben-Nun (Lawren
 ce Livermore National Laboratory (LLNL)); and Didem Unat (Koç University, 
 Turkey)\n\nRecent trends in HPC systems increasingly emphasize accelerator
 s, particularly GPUs, as autonomous execution units, shifting control of e
 ntire program execution to GPUs. In this work, we aim to bridge this gap w
 ith a compiler and provide a productive method for writing efficient GPU-f
 irst code. We design and develop a code generator that efficiently fuses a
 nd schedules persistent kernels, provides high-level abstractions over dev
 ice resources, and enables GPU-initiated communication within Python code 
 using NVSHMEM to realize autonomous multi-GPU execution. We compare our im
 plementation to other accelerated Python compilers including CuPy, DaCe, a
 nd cuNumeric on 22 NPBench kernels. We additionally perform a scaling stud
 y of distributed 2D/3D Jacobi and observe speedups of 6.1× and 30.8× over
  DaCe and cuNumeric, respectively, on 8 GPUs for the 3D case with a scalin
 g efficiency of 98%.\n\nTag: Performance Optimization, Programming Framewo
 rks and System Software\n\nRegistration Category: Workshop Reg Pass\n\nSes
 sion Chairs: CJ Newburn (NVIDIA Corporation), Scott J. Parker (Argonne Nat
 ional Laboratory (ANL)), John Pennycook (Intel Corporation), and Kenneth W
 eiss (Lawrence Livermore National Laboratory (LLNL))\n\n
END:VEVENT
END:VCALENDAR
