BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234542Z
LOCATION:B309
DTSTART;TZID=America/New_York:20241119T110000
DTEND;TZID=America/New_York:20241119T113000
UID:submissions.supercomputing.org_SC24_sess380_pap491@linklings.com
SUMMARY:DFTracer: An Analysis-Friendly Data Flow Tracer for AI-Driven Work
 flows
DESCRIPTION:Hariharan Devarajan (Lawrence Livermore National Laboratory (L
 LNL), Illinois Institute of Technology); Loic Pottier (Lawrence Livermore 
 National Laboratory (LLNL)); Kaushik Velusamy and Huihuo Zheng (Argonne Na
 tional Laboratory (ANL)); Izzet Yildirim (Illinois Institute of Technology
 ); Olga Kogiou and Weikuan Yu (Florida State University); Anthony Kougkas 
 and Xian-He Sun (Illinois Institute of Technology); and Jae-Seung Yeom and
  Kathryn Mohror (Lawrence Livermore National Laboratory (LLNL))\n\nModern 
 HPC workflows involve intricate coupling of simulation, data analytics, an
 d artificial intelligence (AI) applications to improve time to scientific 
 insight. However, current tools are not designed to work with an AI-based 
 I/O software stack that requires tracing at multiple levels of the applica
 tion. To this end, we designed DFTracer to capture data-centric events fro
 m workflows and the I/O stack. DFTracer has three novel features: a unif
 ied interface to capture tracing data from different layers in the softw
 are stack, an analysis-friendly trace format optimized to support effic
 ient loading, and the capability to tag events with workflow-specific con
 text to improve analysis. Additionally, we demonstr
 ate that DFTracer has a 1.44x smaller runtime overhead and 7.1x smaller tr
 ace size as compared to state-of-the-art tools. In conclusion, we demonstr
 ate that DFTracer can capture multi-level performance data with a low over
 head of 1-5% from MuMMI and Megatron-DeepSpeed workflows.\n\nTag: Algorith
 ms, Artificial Intelligence/Machine Learning, Data Movement and Memory, Gr
 aph Algorithms\n\nRegistration Category: Tech Program Reg Pass\n\nSession 
 Chair: Jay Lofstead (Sandia National Laboratories, University of New Mexic
 o)\n\n
END:VEVENT
END:VCALENDAR
