Presentation
Using CXL to Improve Performance of AI Language Models
Description
Increased demand for AI applications highlights the "memory wall" obstacle: a bottleneck in memory capacity and transfer bandwidth. CXL facilitates memory sharing between accelerators and GPUs while enabling direct-attached memory (e.g., DRAM) to be added to any node, improving memory bandwidth, capacity, and performance for AI language models.
This session will explore the advantages of memory sharing and DRAM improvements for CPU-, GPU-, and combined CPU-and-GPU-based memory applications running AI language model workloads, such as retrieval-augmented generation (RAG) and Llama. Attendees will learn about the performance, cost, and power-consumption benefits of DRAM and CXL memory modules.
Session Leader
Additional Session Leaders