
Designing Quality MPI Correctness Benchmarks: Insights and Metrics
Description
Several MPI correctness benchmarks have been proposed to evaluate the quality of MPI correctness tools.
The design of such a benchmark comes with several challenges, which we address in this paper.
First, an imbalance between correct and erroneous codes in a benchmark requires careful interpretation of metrics such as recall, accuracy, and F1 score.
Second, tools that detect errors but do not report additional information, like the affected source line or class of error, are less valuable.
We extend the typical notion of a true positive with stricter variants that consider a tool's helpfulness.
We introduce a new noise metric to consider the amount of distracting error reports.
We evaluate these new metrics on MPI-BugBench with the MPI correctness tools ITAC, MUST, and PARCOACH.
Third, we discuss the complexities of hand-crafted and automatically generated benchmark codes and the additional challenges of non-deterministic errors.
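As background for the metrics named above, the following is a minimal sketch, not the paper's implementation, of how precision, recall, accuracy, and F1 can be computed for a correctness tool run on a labeled benchmark. The stricter true-positive rule (requiring the reported source line and error class to match the ground truth) and all names, types, and the matching logic are illustrative assumptions.

# Illustrative sketch only (assumed data model, not the paper's code):
# classification metrics for an MPI correctness tool on a labeled benchmark,
# with an optional stricter true-positive rule that also checks the reported
# source line and error class.
from dataclasses import dataclass

@dataclass(frozen=True)
class Report:
    case: str         # benchmark case identifier
    line: int         # reported source line
    error_class: str  # e.g. "deadlock", "datatype-mismatch"

@dataclass(frozen=True)
class GroundTruth:
    case: str
    erroneous: bool
    line: int = -1
    error_class: str = ""

def evaluate(reports: list[Report], truth: list[GroundTruth], strict: bool = False):
    tp = fp = fn = tn = 0
    for gt in truth:
        case_reports = [r for r in reports if r.case == gt.case]
        if gt.erroneous:
            if strict:
                # Stricter TP: the tool must also pinpoint line and error class.
                hit = any(r.line == gt.line and r.error_class == gt.error_class
                          for r in case_reports)
            else:
                hit = bool(case_reports)
            tp += hit
            fn += not hit
        else:
            fp += bool(case_reports)
            tn += not case_reports
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(truth) if truth else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

if __name__ == "__main__":
    truth = [GroundTruth("deadlock_01", True, line=42, error_class="deadlock"),
             GroundTruth("correct_01", False)]
    reports = [Report("deadlock_01", line=42, error_class="deadlock")]
    print(evaluate(reports, truth, strict=True))

With an imbalanced benchmark, accuracy alone can be misleading, which is why the abstract stresses careful metric interpretation. A count of distracting reports that match no ground-truth error could be tallied in a similar loop, but the paper's actual noise metric is not reproduced here.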
Event Type
Workshop
Time
Monday, 18 November 2024, 12:12pm - 12:18pm EST
Location
B315
Tags
Debugging and Correctness Tools
Fault-Tolerance, Reliability, Maintainability, and Adaptability
Software Engineering
Registration Categories
W