Detecting and Reproducing Error-Code Propagation Bugs in MPI Implementations
We present an approach to automatically detect and reproduce error-code propagation bugs in MPI implementations. Specifically, we combine static analysis and program repair for bug detection, and apply fault injection to reproduce the error-code propagation bugs found in MPI libraries written in C/C++. We demonstrate our approach on the MPICH library, one of the most popular implementations of MPI, and on the MPICH-based implementation MVAPICH2, uncovering 447 previously unknown bugs. We discovered that 31 of these bugs result in program crashes, and that 60% of the MPICH test suite is susceptible to crashing due to failures to propagate error codes. Moreover, 95 bugs produce undesirable behavior that has been confirmed dynamically: they cause tests to fail, cause processes to hang, or simply drop error codes before they reach user applications.
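To make the bug class concrete, the following is a minimal sketch in C of a dropped error code and of how an injected fault exposes it. It is not taken from MPICH or MVAPICH2 and is not the authors' tooling: every name here (`low_level_send`, `send_buggy`, `send_fixed`, `inject_fault`, `run_with_fault`, `ERR_IO`) is hypothetical and chosen only for illustration.

```c
#include <assert.h>

#define OK     0
#define ERR_IO 1

/* Hypothetical fault-injection switch: when nonzero, the callee fails. */
int inject_fault = 0;

/* Hypothetical low-level routine that reports failure via its return value. */
int low_level_send(int payload) {
    (void)payload;
    return inject_fault ? ERR_IO : OK;
}

/* Buggy caller: the callee's return value is ignored, so a failure
 * never reaches this function's own caller. */
int send_buggy(int payload) {
    int err = OK;
    low_level_send(payload);   /* error code dropped here */
    return err;                /* always reports success */
}

/* Repaired caller: the error code is captured and propagated upward. */
int send_fixed(int payload) {
    int err = low_level_send(payload);
    if (err != OK) {
        return err;
    }
    return OK;
}

/* Run one caller with a fault injected into the callee, then restore
 * the switch so later calls are unaffected. */
int run_with_fault(int (*caller)(int)) {
    inject_fault = 1;
    int err = caller(42);
    inject_fault = 0;
    return err;
}

/* With the fault injected, the buggy caller still reports success,
 * while the repaired caller surfaces the failure. */
void self_check(void) {
    assert(run_with_fault(send_buggy) == OK);
    assert(run_with_fault(send_fixed) == ERR_IO);
}
```

Calling `self_check()` from a `main` function exercises both paths. The point of the sketch is only that a dropped return value stays invisible until the callee is forced to fail, which is why fault injection is a natural way to reproduce such bugs.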
Tue 25 Feb (displayed time zone: Tijuana, Baja California)
10:55 - 12:35
|On the fly MHP Analysis|
|Detecting and Reproducing Error-Code Propagation Bugs in MPI Implementations|
|Parallel and Distributed Bounded Model Checking of Multi-threaded Programs|
|Parallel Race Detection with Futures|