BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234539Z
LOCATION:B303
DTSTART;TZID=America/New_York:20241118T103000
DTEND;TZID=America/New_York:20241118T105000
UID:submissions.supercomputing.org_SC24_sess748_ws_pmbss109@linklings.com
SUMMARY:System-Wide Roofline Profiling: A Case Study on NERSC’s Perlmutter
  Supercomputer
DESCRIPTION:Brian Austin, Dhruva Kulkarni, Brandon Cook, Samuel Williams, 
 and Nicholas Wright (Lawrence Berkeley National Laboratory (LBNL))\n\nHPC 
 system architects routinely use application profiling and performance mode
 ling to evaluate hardware and software performance trade-offs. However, th
 e focus on individual applications leaves gaps in the understanding of sys
 tem utilization because it is impractical to collect profiles and models f
 or every application. In this paper, we use hardware activity metrics gat
 hered from NERSC’s Perlmutter system to perform a roofline performance ana
 lysis of a diverse scientific workload and provide quantitative empirical 
 evidence for widely held beliefs that had previously been inferred from sc
 attered analyses of individual applications. Specifically, we confirm the 
 predominance of double-precision operations. The arithmetic intensity dist
 ribution suggests that nearly equal fractions of the workload are comput
 e-bound and bandwidth-bound on Perlmutter GPUs. These results stand in w
 orrisome contrast to hardware performance trends, where artificial intel
 ligence applications are driving processors to emphasize the performanc
 e of reduced-precision operations, and gains in memory bandwidth are no
 t keeping pace with peak processing rates.\n\nTag: Accelerators, Modelin
 g and Simulation, Perfo
 rmance Evaluation and/or Optimization Tools\n\nRegistration Category: Work
 shop Reg Pass\n\nSession Chairs: Simon Hammond (National Nuclear Security 
 Administration (NNSA)); Sascha Hunold (Technical University of Vienna); an
 d Steven A. Wright (University of York, England)\n\n
END:VEVENT
END:VCALENDAR
