BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T233526Z
LOCATION:B302-B305
DTSTART;TZID=America/New_York:20241120T100000
DTEND;TZID=America/New_York:20241120T170000
UID:submissions.supercomputing.org_SC24_sess533_post153@linklings.com
SUMMARY:Prompt Phrase Ordering Using Large Language Models in HPC: Evaluat
 ing Prompt Sensitivity
DESCRIPTION:Noah Thomasson (Oak Ridge National Laboratory (ORNL), PCIP Int
 ernship Program) and Hilda Klasky (Oak Ridge National Laboratory (ORNL))\n
 \nLarge language models (LLMs) often require well-designed prompts for eff
 ective responses, but optimizing prompts is challenging due to prompt sens
 itivity, where small changes can cause significant performance variations.
  This study evaluates prompt performance across all permutations of indepe
 ndent phrases to investigate prompt sensitivity and robustness. We used tw
 o datasets: GSM8K, for mathematical reasoning, and a custom prompt for sum
 marizing database metadata. Performance was assessed using the llama3-inst
 ruct-7B model on Ollama and parallelized in a high-performance computing e
 nvironment. We compared phrase indices in the best and worst prompts and u
 sed Hamming distance to measure performance changes between phrase orderin
 gs. Results show that prompt phrase ordering significantly affects LLM per
 formance, with Hamming distance indicating that changes can dramatically a
 lter scores, often by chance. This supports existing findings on prompt se
 nsitivity. Our study highlights the challenges in prompt optimization, ind
 icating that modifying phrases in a successful prompt does not guarantee a
 nother successful prompt.\n\nRegistration Category: Tech Program Reg Pass,
  Exhibits Reg Pass\n\nSession Chairs: Ayesha Afzal (Friedrich-Alexander Un
 iversity, Erlangen-Nuremberg; Erlangen National High Performance Computing
  Center); Sally Ellingson (University of Kentucky); and Alan Sussman (Univ
 ersity of Maryland)\n\n
END:VEVENT
END:VCALENDAR
