Presentation
Compiler-Aided Correctness Checking of CUDA-Aware MPI Applications
Description
Hybrid MPI + X models, which combine the Message Passing Interface (MPI) with node-level parallel programming models, increase complexity and introduce additional correctness issues. This work addresses the challenge of detecting data races in hybrid CUDA-aware MPI applications, a challenge that stems from the asynchronous and non-blocking nature of the CUDA and MPI APIs. We introduce CuSan, an LLVM compiler extension and runtime that tracks CUDA-specific concurrency, synchronization, and memory access semantics. We integrate CuSan with MUST, a dynamic MPI correctness tool, and ThreadSanitizer (TSan), a thread-level data race detector. MUST with TSan can already detect concurrency issues in multi-threaded MPI codes. Together with CuSan, these tools enable comprehensive checking for concurrency issues in CUDA-aware MPI applications. Our evaluation on two mini-apps shows a runtime overhead of 6× to 36× relative to the uninstrumented version, depending on the amount of memory tracked by TSan. Memory overhead remains under 1.8×. CuSan is available at https://github.com/ahueck/cusan