BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234126Z
LOCATION:B312
DTSTART;TZID=America/New_York:20241118T090000
DTEND;TZID=America/New_York:20241118T173000
UID:submissions.supercomputing.org_SC24_sess811@linklings.com
SUMMARY:Communication, I/O, and Storage at Scale on Next-Generation Platfo
 rms – Scalable Infrastructures
DESCRIPTION:Next-generation HPC platforms deal with increasing heterogenei
 ty in their subsystems. These subsystems include internal high-speed fabri
 cs for inter-node communication; storage systems integrated with programma
 ble data processing units (DPUs) and infrastructure processing units (IPUs)
  to support software-defined networks; traditional storage infrastructures
  with global parallel POSIX-based filesystems complemented with scalable o
 bject stores; and heterogeneous compute nodes configured with a diverse sp
 ectrum of CPUs and accelerators (e.g., GPU, FPGA, AI processors) having co
 mplex intra-node communication. The workshop will pursue multiple objectiv
 es, including: (1) developing and providing a holistic overview of next-ge
 neration platforms with an emphasis on communication, I/O, and storage at s
 cale; (2) showcasing application-driven performance analysis with various H
 PC network fabrics; (3) presenting experiences with emerging storage concep
 ts like object stores and all-flash storage; (4) sharing experiences with p
 erformance tuning on heterogeneous platforms from multiple vendors; and (5
 ) sharing best practices for application programming with complex communica
 tion, I/O
 , and storage at scale.\n\nBenchmarking Ethernet Interconnect for HPC/AI w
 orkloads\n\nInterconnects have always played a cornerstone role in HPC. Si
 nce the inception of the Top500 ranking, interconnect statistics have been d
 ominated by two competing technologies: InfiniBand and Ethernet. However, e
 ven though Ethernet increased its popularity due to versatility and cost
 -e...\n\n\nLorenzo Pichetti (University of Trento, Ital
 y); Daniele De Sensi (Sapienza University of Rome); Karthee Sivalingam (Hu
 awei Technologies Ltd, KTH Royal Institute of Technology); Stepan Nassyr (
 ParTec AG, Germany; Forschungszentrum Jülich); Dirk Pleiter (KTH Royal Ins
 titute of Technology); Aldo Artigiani (Huawei Technologies Ltd); Flavio Ve
 lla (University of Trento, Italy); and Daniele Cesarini and Matteo Turisin
 i (CINECA)\n---------------------\nIXPUG: Afternoon Break\n\nIXPUG: After
 noon Break\n\n---------------------\nPerformance analysis of a stencil co
 de in modern C++\n\nIn this paper we evaluate multiple parallel programmin
 g models with respect to both ease of expression and resulting performance
 . We do this by implementing the mathematical algorithm known as the `powe
 r method' in a variety of ways, using modern C++ techniques.\n\n\nVi
 ctor Eijkhout and Yojan Chitkara (The University of Texas at Austin)\n----
 -----------------\nPredicting Protein Folding on Intel’s Data Center GPU M
 ax Series Architecture (PVC)\n\nPredicting the structure of proteins has b
 een a grand challenge for over 60 years. Google's DeepMind team leveraged a
 rtificial intelligence in 2020 to develop AlphaFold and achieved an accurac
 y above 90 for two-thirds of the proteins in the CASP competition. AlphaFol
 d has been very successful in biol...\n\n\nMadhavan Prasanna (Purdue Uni
 versity), Dhani Ruhela (Westwood High School), and Aaditya Saxena (Bob Jon
 es High School)\n---------------------\nIXPUG: Introduction and Welcome\n\
 nAmit Ruhela (The University of Texas at Austin)\n---------------------\nC
 an Current SDS Controllers Scale To Modern HPC Infrastructures?\n\nModern 
 supercomputers host numerous jobs that compete for shared storage resource
 s, causing I/O interference and performance degradation. Solutions based o
 n software-defined storage (SDS) emerged to address this issue by coordina
 ting the storage environment through the enforcement of QoS policies. H...
 \n\n\nMariana Miranda (INESC TEC & University of Minho); Yusuke Tanimura a
 nd Jason Haga (National Institute of Advanced Industrial Science and Techn
 ology (AIST), Japan); Amit Ruhela, Stephen Lien Harrell, and John Cazes (T
 exas Advanced Computing Center & University of Texas at Austin); and Ricar
 do Macedo, José Pereira, and João Paulo (INESC TEC & University of Minho)\
 n---------------------\nProtocol Buffer Deserialization DPU Offloading in 
 the RPC Datapath\n\nIn the microservice paradigm, monolithic applications 
 are decomposed into finer-grained modules invoked independently in a data-
 flow fashion. The different modules communicate through remote procedure c
 alls (RPCs), which constitute a critical component of the infrastructure. 
 To ensure portable passa...\n\n\nRaphaël Frantz (Eindhoven University of T
 echnology, Netherlands); Jerónimo Sánchez García (Aalborg University, Cope
 nhagen); Marcin Copik (ETH Zürich); Idelfonso Tafur Monroy (Eindhoven Univ
 ersity of Technology, Netherlands); and Juan José Vegas Olmos, Gil Bloch, 
 and Salvatore Di Girolamo (NVIDIA Corporation)\n---------------------\nIXP
 UG: Lunch\n\nIXPUG: Lunch\n\n---------------------\nKeynote: Network and
  Communication Infrastructure powering Meta’s GenAI and Recommendation Sys
 tems\n\nIn 2020, Meta changed the way we did AI Training. We moved to a sy
 nchronous training approach to power our recommendation systems. This pivo
 t required us to build high-speed, low-latency RDMA networks to interconne
 ct GPUs. Over the years Meta has built some of the largest AI clusters in t
 he world to ...\n\n\nAdithya Gangidi and Mikel Jimenez Fernandez (Meta)\n-
 --------------------\nIXPUG: Morning Break\n\nIXPUG: Morning Break\n\n---
 ------------------\nIXPUG: Open Discussion and Wrapup\n\nD
 avid Martin (Argonne National Laboratory (ANL))\n---------------------\nFr
 om Tensor Processing Primitive towards Tensor Compilers using upstream MLI
 R\n\nDuring the past decade, Deep Learning (DL) algorithms, programming sy
 stems and hardware have converged with their High Performance Computing (HPC
 ) counterparts. Nevertheless, the programming methodology of DL and HPC sy
 stems is stagnant, relying on highly-optimized, yet platform-specific and 
 inflexibl...\n\n\nAlexander Heinecke (Intel Corporation)\n----------------
 -----\nAn Efficient Checkpointing System for Large Machine Learning Model 
 Training\n\nAs machine learning models increase in size and complexity rap
 idly, the cost of checkpointing in ML training has become a bottleneck in s
 torage and performance (time). For example, the latest GPT-4 model has mass
 ive parameters at the scale of 1.76 trillion. It is highly time- and storag
 e-consuming to fre...\n\n\nWubiao Xu (Nanchang Hangkong University); Xi
 n Huang (Kobe University, Japan; RIKEN); Shiman Meng and Weiping Zhang (Na
 nchang Hangkong University); Luanzheng Guo (Pacific Northwest National Lab
 oratory (PNNL)); Kento Sato (RIKEN); and Guoyuan Jia (Nanchang Hangkong Un
 iversity)\n---------------------\nModeling and Simulation of Collective Al
 gorithms on HPC Network Topologies using Structural Simulation Toolkit\n\n
 In the last decade, DL training has emerged as an HPC-scale workload runni
 ng on large clusters. The dominant communication pattern in distributed da
 ta-parallel DL training is allreduce, which is used to sum the model gradi
 ents across processes during the backpropagation phase. Various allreduce a
 lgor
 ithm...\n\n\nSai Prabhakar Rao Chenna (Intel Corporation)\n\nTag: I/O, Sto
 rage, Archive\n\nRegistration Category: Workshop Reg Pass\n\nSession Chair
 s: Glenn Brook (Cornelis Networks, University of Tennessee); Clayton Hughe
 s (Sandia National Laboratories); Nalini Kumar (Intel Corporation); Hatem 
 Ltaief (King Abdullah University of Science and Technology (KAUST)); David
  Martin (Argonne National Laboratory (ANL), Northwestern University); and 
 Amit Ruhela (Texas Advanced Computing Center (TACC), University of Texas)
END:VEVENT
END:VCALENDAR
