Session

Paper: GPU Optimizations for ML
Event Type: Paper
Time: Wednesday, 20 November 2024, 10:30am - 12pm EST
Location: B308
Tags
Accelerators
Artificial Intelligence/Machine Learning
Cloud Computing
Distributed Computing
Heterogeneous Computing
Performance Optimization
Registration Categories
TP
Presentations
10:30am - 11:00am EST PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
Accelerators
Artificial Intelligence/Machine Learning
Distributed Computing
Heterogeneous Computing
Performance Optimization
11:00am - 11:30am EST RecFlex: Enabling Feature Heterogeneity-Aware Optimization for Deep Recommendation Models with Flexible Schedules
Accelerators
Artificial Intelligence/Machine Learning
Distributed Computing
Heterogeneous Computing
Performance Optimization
11:30am - 12:00pm EST ParvaGPU: Efficient Spatial GPU Sharing for Large-Scale DNN Inference in Cloud Environments
Accelerators
Artificial Intelligence/Machine Learning
Cloud Computing