BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234542Z
LOCATION:B302-B305
DTSTART;TZID=America/New_York:20241119T120000
DTEND;TZID=America/New_York:20241119T170000
UID:submissions.supercomputing.org_SC24_sess487_post278@linklings.com
SUMMARY:PcMINER: Mining Performance-Related Commits at Scale
DESCRIPTION:Md Abul Kalam Azad, Manoj Alexender, Matthew Alexender, Syed S
 alauddin Mohammad Tariq, Foyzul Hassan, and Probir Roy (University of Mich
 igan - Dearborn)\n\nPerformance inefficiencies in software can severely im
 pact application quality and resource utilization. Addressing these issues
  often requires significant developer effort, yet the lack of large-scale,
  open-source performance datasets hinders the development of effective mit
 igation strategies. To fill this gap, we present PcMINER, a tool that mine
 s performance inefficiency-related commits from GitHub at scale. PcMINER u
 ses PcERT-KD, a transformer model that classifies these commits with accur
 acy comparable to 7B-parameter LLMs but with reduced computational costs,
  making it ideal for CPU cluster deployment. By mining GitHub repositories
  with a 50-node CPU cluster, PcMINER has generated a dataset of 162K
  perfor
 mance-related commits in C++ and 103.8K in Python. This dataset promises t
 o enhance data-driven approaches to detecting performance inefficiencies. 
 \n\nIn the poster session, I will present the problem, motivation, methodo
 logy, and results, with additional details that may be accessible through 
 a QR code, and will provide a brief oral overview.\n\nRegistration Categor
 y: Tech Program Reg Pass, Exhibits Reg Pass\n\nSession Chairs: Ayesha Afza
 l (Friedrich-Alexander University, Erlangen-Nuremberg; Erlangen National H
 igh Performance Computing Center); Sally Ellingson (University of Kentucky
 ); and Alan Sussman (University of Maryland)\n\n
END:VEVENT
END:VCALENDAR
