BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T143139Z
LOCATION:B310
DTSTART;TZID=America/New_York:20241117T162000
DTEND;TZID=America/New_York:20241117T163000
UID:submissions.supercomputing.org_SC24_sess737_ws_ia109@linklings.com
SUMMARY:Performance evaluation and modelling of single-precision matrix mu
 ltiplication on Cerebras CS-2
DESCRIPTION:Ryunosuke Matsuzaki (Meiji University), Daichi Mukunoki (Indep
 endent), and Takaaki Miyajima (Meiji University)\n\nAlthough recent superc
 omputers have been improving their computational performance, achieving pe
 rformance scaling with respect to the number of nodes is not easy due to l
 ong inter-node communication latency. Many attempts have been made to hide
  communication latency and maintain strong scalability even for dense matr
 ix multiplication. Matrix multiplication is an ideal candidate for benchma
 rking the performance of supercomputers. The Cerebras CS-2 system is an ac
 celerator for deep learning with the world’s largest chip, the wafer-scale
  engine 2 (WSE-2). The WSE-2 can be considered a distributed memory system
  that comes with 745500 processing elements connected in a low-latency 2D 
 mesh topology. This paper presents the maximum performance and the weak an
 d strong scaling performance of single-precision matrix multiplication o
 n the CS-2 and proposes a performance model for it. We observed a maximu
 m performance of 349.0 TFlop/s (matrix size: 33000x33000) and a weak scali
 ng efficiency of 1.00. The mean absolute percentage error of the model wa
 s 4.7%.\n\nTag: Gr
 aph Algorithms, Heterogeneous Computing, Programming Frameworks and System
  Software\n\nRegistration Category: Workshop Reg Pass\n\nSession Chairs: M
 ichela Becchi (North Carolina State University); John Feo (Pacific Northwe
 st National Laboratory (PNNL)); Antonino Tumeo (Pacific Northwest National
  Laboratory (PNNL)); and Ana Lucia Varbanescu (University of Twente, Nethe
 rlands; University of Amsterdam, Netherlands)\n\n
END:VEVENT
END:VCALENDAR
