BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234541Z
LOCATION:B202
DTSTART;TZID=America/New_York:20241118T133000
DTEND;TZID=America/New_York:20241118T170000
UID:submissions.supercomputing.org_SC24_sess415_tut162@linklings.com
SUMMARY:Scalable Big Data Processing on High-Performance Computing Systems
DESCRIPTION:Dhabaleswar K. (DK) Panda, Jinghan Yao, and Kinan Alattar (The
  Ohio State University)\n\nThere are several popular Big Data processing f
 rameworks including Apache Spark and Dask. These frameworks are not capabl
 e of exploiting high-speed and low-latency networks like InfiniBand, Omni-
 Path, Slingshot, and others. In the High-Performance Computing (HPC) commu
 nity, Message-Passing Interface (MPI) libraries are widely adopted to t
 ackle this issue by executing scientific and engineering applications on p
 arallel hardware connected via fast interconnect.\n\nThis tutorial introdu
 ces MPI4Spark and MPI4Dask, which are enhanced Spark and Dask frameworks,
  respectively, capable of using MPI for communication in parallel and dist
 ributed settings on HPC systems. MPI4Spark can launch the Spark ecos
 ystem using MPI launchers to utilize MPI communication. It also maintains 
 isolation for application execution by forking new processes using Dynamic
  Process Management (DPM). MPI4Spark also provides portability and perform
 ance benefits as it can utilize popular HPC interconnects. MPI4Dask is an 
 MPI-based custom Dask framework targeted at modern HPC clusters built
  with CPUs and NVIDIA GPUs.\n\nThis tutorial provides a detailed overvi
 ew of the design, implementation, and evaluation of MPI4Spark and MPI4Dask
  on state-of-the-art HPC systems. Later, we also cover writing, running, a
 nd demonstrating user Big Data applications on HPC systems.\n\nTag: Emergi
 ng Technologies, Scalable Data Mining\n\nRegistration Category: Tutorial R
 eg Pass\n\n
END:VEVENT
END:VCALENDAR
