BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234540Z
LOCATION:B303
DTSTART;TZID=America/New_York:20241118T090000
DTEND;TZID=America/New_York:20241118T093000
UID:submissions.supercomputing.org_SC24_sess748_ws_pmbsf125@linklings.com
SUMMARY:LLM-Inference-Bench: Inference Benchmarking of Large Language Mode
 ls on AI Accelerators
DESCRIPTION:Krishna Teja Chitty-Venkata, Siddhisanket Raskar, Bharat Kale,
  Farah Ferdaus, Aditya Tanikanti, Ken Raffenetti, Valerie Taylor, Murali E
 mani, and Venkatram Vishwanath (Argonne National Laboratory (ANL))\n\nLarg
 e language models (LLMs) have propelled groundbreaking advancements across
  several domains and are commonly used for text generation applications. H
 owever, the computational demands of these complex models pose significant
  challenges, requiring efficient hardware acceleration. Benchmarking the p
 erformance of LLMs across diverse hardware platforms is crucial to underst
 anding their scalability and throughput characteristics. We introduce LLM-
 Inference-Bench, a comprehensive benchmarking suite to evaluate the hardwa
 re inference performance of LLMs. We thoroughly analyze diverse hardware p
 latforms, including GPUs from Nvidia and AMD, and specialized AI accelerat
 ors from Intel Habana and SambaNova. Our evaluation includes several LLM
  inference frameworks and models from the LLaMA, Mistral, and Qwen
  families with 7B a
 nd 70B parameters. Our benchmarking results reveal the strengths and limit
 ations of various models, hardware platforms, and inference frameworks. We
  provide an interactive dashboard to help identify configurations for opti
 mal performance for a given hardware platform.\n\nTag: Accelerators, Model
 ing and Simulation, Performance Evaluation and/or Optimization Tools\n\nRe
 gistration Category: Workshop Reg Pass\n\nSession Chairs: Simon Hammond (N
 ational Nuclear Security Administration (NNSA)); Sascha Hunold (Technical 
 University of Vienna); and Steven A. Wright (University of York, England)\
 n\n
END:VEVENT
END:VCALENDAR
