Presentation

hZCCL: Accelerating Collective Communication with Co-Designed Homomorphic Compression
Description
As network bandwidth lags behind increasing computing power, efficient collective communication is a major challenge for exascale applications. Traditional approaches use error-bounded lossy compression to accelerate collective operations but suffer from the costly decompression-operation-compression (DOC) workflow. We propose hZCCL, the first homomorphic compression-communication co-design enabling direct operations on compressed data, avoiding expensive DOC overhead. Alongside the co-design framework, we introduce a lightweight, multi-core CPU-optimized compressor and a homomorphic compressor with a runtime heuristic to select efficient compression pipelines dynamically. We evaluate hZCCL with up to 512 nodes and across five application datasets. The experimental results demonstrate that our homomorphic compressor achieves a CPU throughput of up to 379.08 GB/s, surpassing the conventional DOC workflow by up to 36.53X. Moreover, our hZCCL-accelerated collectives outperform two state-of-the-art baselines, delivering speedups of up to 2.12X and 6.77X compared to original MPI collectives in single-thread and multi-thread modes, respectively, while maintaining data accuracy.
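To make the contrast between the DOC workflow and operating directly on compressed data concrete, here is a minimal toy sketch. It assumes a plain uniform scalar quantizer with absolute error bound `eb`; this is an illustrative stand-in, not hZCCL's actual compressor, and the function names (`compress`, `decompress`, `doc_sum`, `homomorphic_sum`) are hypothetical.

```python
from typing import List


def compress(values: List[float], eb: float) -> List[int]:
    """Toy error-bounded quantizer (assumption, not hZCCL's scheme).

    Each value maps to an integer code; reconstruction error is at most eb.
    """
    step = 2.0 * eb
    return [round(v / step) for v in values]


def decompress(codes: List[int], eb: float) -> List[float]:
    """Reconstruct approximate values from integer codes."""
    step = 2.0 * eb
    return [c * step for c in codes]


def doc_sum(codes_a: List[int], codes_b: List[int], eb: float) -> List[int]:
    """Decompression-operation-compression: three passes over the data."""
    a = decompress(codes_a, eb)
    b = decompress(codes_b, eb)
    return compress([x + y for x, y in zip(a, b)], eb)


def homomorphic_sum(codes_a: List[int], codes_b: List[int]) -> List[int]:
    """Add the codes directly: one pass, no decompression needed."""
    return [x + y for x, y in zip(codes_a, codes_b)]


if __name__ == "__main__":
    eb = 0.01
    a = [0.113, 1.267, -3.402]
    b = [2.058, -0.511, 0.334]
    ca, cb = compress(a, eb), compress(b, eb)
    # Both routes yield the same compressed result for this toy quantizer.
    print(homomorphic_sum(ca, cb) == doc_sum(ca, cb, eb))
```

With this quantizer the reduction on codes matches the DOC result while skipping two decompressions and one recompression per reduction step. The error of the reduced values is bounded by the sum of the per-operand bounds (here `2 * eb`), which is a property of this toy scheme and not a claim about hZCCL's accuracy guarantees.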
Event Type
Paper
Time
Thursday, 21 November 2024, 4pm - 4:30pm EST
Location
B312-B313A
Tags
Data Compression
Data Movement and Memory
Distributed Computing
Message Passing
Network
Registration Categories
TP