BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234541Z
LOCATION:B306
DTSTART;TZID=America/New_York:20241117T121000
DTEND;TZID=America/New_York:20241117T123000
UID:submissions.supercomputing.org_SC24_sess734_ws_pawatm114@linklings.com
SUMMARY:Intel® SHMEM: GPU-initiated OpenSHMEM using SYCL
DESCRIPTION:Alex Brooks, Philip Marshall, David Ozog, Md W. Rahman, Lawren
 ce Stewart, and Rithwik Tom (Intel Corporation)\n\nModern high-end systems
  are increasingly becoming heterogeneous, providing users options to use g
 eneral-purpose Graphics Processing Units (GPUs) and other accelerators fo
 r additional performance. High Performance Computing (HPC) and Artifi
 cial In
 telligence (AI) applications are often carefully arranged to overlap commu
 nications and computation for increased efficiency on such platforms. This
  has led to efforts to extend popular communication libraries to support G
 PU awareness and, more recently, GPU-initiated operations. In this pape
 r, w
 e present Intel SHMEM, a library that enables users to write programs that
  are GPU aware, in that API calls support GPU memory, and also support GPU
 -initiated communication operations by embedding OpenSHMEM style calls wit
 hin GPU kernels. We also propose thread-collaborative extensions to the Op
 enSHMEM standard that can enable users to better exploit the strengths of 
 GPUs. Our implementation adaptively chooses between direct load/store fro
 m the GPU and GPU copy-engine-based transfers to optimize performance o
 n diffe
 rent configurations.\n\nTag: Heterogeneous Computing, Parallel Programming
  Methods, Models, Languages and Environments, PAW-Full, Task Parallelism\n
 \nRegistration Category: Workshop Reg Pass\n\nSession Chairs: Engin Kayrak
 lioglu (Hewlett Packard Enterprise (HPE)); Daniele Lezzi (Barcelona Superc
 omputing Center (BSC)); Karla Vanessa Morris Wright (Sandia National Labor
 atories); Irene Moulitsas (Cranfield University); Elliott Slaughter (SLAC 
 National Accelerator Laboratory); and Kenjiro Taura (The University of Tok
 yo, Japan)\n\n
END:VEVENT
END:VCALENDAR
