BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T143138Z
LOCATION:B311
DTSTART;TZID=America/New_York:20241119T153000
DTEND;TZID=America/New_York:20241119T160000
UID:submissions.supercomputing.org_SC24_sess382_pap439@linklings.com
SUMMARY:Matrix-Free Finite-Volume Kernels on a Dataflow Architecture
DESCRIPTION:Ryuichi Sai (Rice University)\; Francois Hamon (TotalEnergies
  E&P Research and Technology USA\, LLC)\; John Mellor-Crummey (Rice
  University)\; and Mauricio Araya-Polo (TotalEnergies E&P Research and
  Technology USA\, LLC)\n\nFast and accurate numerical simulations are
  crucial for designing large-scale geological carbon storage projects
  that ensure safe long-term CO2 containment as a climate change
  mitigation strategy. These simulations involve solving numerous large
  and complex linear systems arising from the implicit Finite-Volume (FV)
  discretization of the PDEs governing subsurface fluid flow. With highly
  detailed geo-models\, solving these linear systems is expensive in both
  computation and memory\, and accounts for the majority of the
  simulation computing time. Modern systems with intricate memory
  hierarchies are insufficient to overcome the challenges of large-scale
  numerical simulations. Therefore\, exploring algorithms that can
  leverage alternative and balanced paradigms\, such as dataflow and
  in-memory computing\, is crucial. This work introduces a matrix-free
  algorithm to solve FV-based linear systems using a dataflow
  architecture to significantly reduce memory bottlenecks. Our
  implementation achieves a speedup of two orders of magnitude over a
  GPGPU-based reference implementation\, and up to 1.2 PFlops on a single
  dataflow device.\n\nTag: Accelerators\, Applications and Application
  Frameworks\, Graph Algorithms\, Modeling and Simulation\, Numerical
  Methods\n\nRegistration Category: Tech Program Reg Pass\n\nSession
  Chair: Wenqian Dong (Oregon State University)\n\n
END:VEVENT
END:VCALENDAR
