BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T143139Z
LOCATION:B214
DTSTART;TZID=America/New_York:20241117T133000
DTEND;TZID=America/New_York:20241117T170000
UID:submissions.supercomputing.org_SC24_sess419_tut173@linklings.com
SUMMARY:Tools to Diagnose and Repair Floating-Point Errors in Heterogeneou
 s Computing Hardware and Software
DESCRIPTION:Ganesh Gopalakrishnan (University of Utah); Cindy Rubio-Gonzal
 ez (University of California, Davis); Xinyi Li (University of Utah); Dolor
 es Miao (University of California, Davis); and Edward Misback (University 
 of Washington)\n\nFloating-point arithmetic is central to HPC and ML, with
  the variety of number formats, hardware platforms, and compilers explodin
 g in this era of heterogeneity. This unfortunately increases the incidence
  of numerical issues including exceptions such as Infinity and NaN that ca
 n render the computed results unreliable or change control flow, introduc
 es excessive rounding that breaks the assumptions made in the numerical al
 gorithm in use, and overall causes result non-reproducibility when code is
  optimized or ported across platforms. In this tutorial, we present three 
 novel tools: (1) GPU-FPX, which exposes silent exceptions in NVIDIA GPU co
 mputations, (2) Ciel, which pinpoints where compilers silently over-optimi
 ze and cause non-reproducibility, and (3) Herbie, which improves the accur
 acy of a programmer-written expression, significantly reducing rounding er
 ror or eliminating exceptions. This half-day tutorial will consist of (1)
  presentations of floating-point basics, (2) demos of all our numerical
  debugging tools, presenting their principles of operation and ideal
  usage contexts, and (3) plenty of time for Q&A, especially on using
  these tools within attendees' organizations. New and emerging
  technologies suc
 h as Tensor Cores will be introduced by showing how to test for non-portab
 ility of codes across them.\n\nTag: Debugging and Correctness Tools, Emerg
 ing Technologies, Fault-Tolerance, Reliability, Maintainability, and Adapt
 ability, Numerical Methods\n\nRegistration Category: Tutorial Reg Pass\n\n
END:VEVENT
END:VCALENDAR
