BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20250626T234543Z
LOCATION:B308
DTSTART;TZID=America/New_York:20241119T160000
DTEND;TZID=America/New_York:20241119T163000
UID:submissions.supercomputing.org_SC24_sess392_pap332@linklings.com
SUMMARY:SMIless: Serving DAG-based Inference with Dynamic Invocations unde
 r Serverless Computing
DESCRIPTION:Chengzhi Lu (Shenzhen Institute of Advanced Technology, Chines
 e Academy of Sciences; University of Macau); Huanle Xu, Yudan Li, and Weny
 an Chen (University of Macau); Kejiang Ye (Shenzhen Institute of Advanced 
 Technology, Chinese Academy of Sciences); and Chengzhong Xu (University of
  Macau)\n\nThe deployment of ML serving applications, featuring multiple i
 nference functions on serverless platforms, has gained substantial popular
 ity, leading to numerous developments of new systems. However, these syste
 ms often focus on optimizing resource provisioning and cold start manageme
 nt separately, ultimately resulting in higher monetary costs.\nThis paper 
 introduces SMIless, a highly efficient serverless system tailored for serv
 ing DAG-based ML inference in heterogeneous environments. SMIless effectiv
 ely co-optimizes resource configuration and cold-start management in the c
 ontext of dynamic invocations. This is achieved by seamlessly integrating 
 adaptive pre-warming windows, striking an effective balance between perfor
 mance and cost. We have implemented SMIless on top of OpenFaaS and conduct
 ed extensive evaluations using real-world ML serving applications. The exp
 erimental results demonstrate that SMIless can achieve up to a 5.73× redu
 ction in the overall costs while meeting the SLA requirements for al
 l user requests, surpassing the performance of state-of-the-art solutions.
 \n\nTag: Distributed Computing, Middleware and System Software\n\nRegistra
 tion Category: Tech Program Reg Pass\n\nSession Chair: Dejan Milojicic (He
 wlett Packard Labs)\n\n
END:VEVENT
END:VCALENDAR
