Detecting and Reproducing Error-Code Propagation Bugs in MPI Implementations
We present an approach to automatically detect and reproduce error-code propagation bugs in MPI implementations. Specifically, we combine static analysis and program repair for bug detection, and apply fault injection to reproduce the error-code propagation bugs found in MPI libraries written in C/C++. We demonstrate our approach on the MPICH library, one of the most popular implementations of MPI, and on the MPICH-based implementation MVAPICH2, uncovering 447 previously unknown bugs. We discovered that 31 of these bugs result in program crashes, and that 60% of the MPICH test suite is susceptible to crashing due to failures to propagate error codes. Moreover, 95 bugs produce undesirable behavior that has been confirmed dynamically: tests fail, processes hang, or error codes are simply dropped before reaching user applications.
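To make the bug class concrete, the following is a minimal sketch in C of what a dropped error code can look like, and of how a simple fault-injection switch can expose it. It is not taken from MPICH or MVAPICH2 and does not represent the authors' tooling; every name in it (`DEMO_SUCCESS`, `DEMO_ERR_IO`, `inject_fault`, `low_level_write`, `send_buggy`, `send_fixed`) is hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical error codes, standing in for a library's real ones. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_IO = 1 };

/* Fault-injection switch: when set, low_level_write() fails with this code. */
static int injected_fault = DEMO_SUCCESS;

void inject_fault(int code) {
    assert(code != DEMO_SUCCESS); /* only real error codes may be injected */
    injected_fault = code;
}

void clear_fault(void) {
    injected_fault = DEMO_SUCCESS;
}

/* Lowest layer: reports failure only through its return value. */
int low_level_write(const char *buf) {
    (void)buf; /* the payload is irrelevant to this sketch */
    return injected_fault;
}

/* Buggy wrapper: the callee's error code is ignored, so the caller
 * always sees success even when the write failed. */
int send_buggy(const char *buf) {
    low_level_write(buf); /* bug: return value dropped */
    return DEMO_SUCCESS;
}

/* Repaired wrapper: the error code is checked and propagated upward. */
int send_fixed(const char *buf) {
    int err = low_level_write(buf);
    if (err != DEMO_SUCCESS) {
        return err; /* propagate the failure to the caller */
    }
    return DEMO_SUCCESS;
}
```

With a fault injected through `inject_fault(DEMO_ERR_IO)`, `send_buggy` still reports `DEMO_SUCCESS` while `send_fixed` returns `DEMO_ERR_IO`. That difference is the kind of silently lost failure the abstract describes, shown here only under the assumptions stated above.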
Tue 25 Feb