Detecting and Reproducing Error-Code Propagation Bugs in MPI Implementations
We present an approach to automatically detect and reproduce error-code propagation bugs in MPI implementations. Specifically, we combine static analysis and program repair for bug detection, and apply fault injection to reproduce error-code propagation bugs found in MPI libraries written in C/C++. We demonstrate our approach on MPICH, one of the most popular implementations of MPI, and on the MPICH-based implementation MVAPICH2, uncovering 447 previously unknown bugs. We discovered that 31 of these bugs result in program crashes, and that 60% of the MPICH test suite is susceptible to crashing due to failures to propagate error codes. Moreover, 95 bugs produce undesirable behavior that has been confirmed dynamically: failing tests, hanging processes, or error codes that are silently dropped before they reach user applications.
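To make the bug class concrete, here is a minimal sketch in C of a dropped error code and of how a fault-injection switch can expose it. Every name in it (`demo_alloc_buffer`, `demo_init_buggy`, `demo_init_fixed`, `demo_inject_fault`, the `DEMO_*` codes) is hypothetical and chosen for illustration only; none of it is taken from MPICH, MVAPICH2, or the authors' tooling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes for this sketch (not MPICH's). */
enum { DEMO_SUCCESS = 0, DEMO_ERR_NO_MEM = 1 };

/* Fault-injection switch: when true, demo_alloc_buffer reports failure. */
bool demo_inject_fault = false;

/* Callee: returns an error code and writes *out only on success. */
int demo_alloc_buffer(int *out)
{
    assert(out != NULL);
    if (demo_inject_fault)
        return DEMO_ERR_NO_MEM;   /* injected failure */
    *out = 42;
    return DEMO_SUCCESS;
}

/* Buggy caller: the callee's return value is discarded, so a failure
 * is reported to the caller's caller as success. */
int demo_init_buggy(int *out)
{
    demo_alloc_buffer(out);       /* error code dropped here */
    return DEMO_SUCCESS;
}

/* Repaired caller: the error code is checked and propagated upward. */
int demo_init_fixed(int *out)
{
    int err = demo_alloc_buffer(out);
    if (err != DEMO_SUCCESS)
        return err;
    return DEMO_SUCCESS;
}
```

With `demo_inject_fault` set to `true`, `demo_init_buggy` still returns `DEMO_SUCCESS` while `demo_init_fixed` returns `DEMO_ERR_NO_MEM`. That difference is the kind of behavior a fault-injection run can observe dynamically; the sketch makes no claim about how the authors' static analysis or repair step is implemented.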
Tue 25 Feb (displayed time zone: Tijuana, Baja California)
10:55 - 12:35
10:55 (25 min) Talk: On the fly MHP Analysis. Main Conference.
11:20 (25 min) Talk: Detecting and Reproducing Error-Code Propagation Bugs in MPI Implementations. Main Conference. Daniel DeFreez (University of California, Davis), Antara Bhowmick (University of California, Davis), Ignacio Laguna (Lawrence Livermore National Laboratory), Cindy Rubio-González (University of California, Davis).
11:45 (25 min) Talk: Parallel and Distributed Bounded Model Checking of Multi-threaded Programs. Main Conference.
12:10 (25 min) Talk: Parallel Race Detection with Futures. Main Conference. Yifan Xu (Washington University in St. Louis), Kyle Singer (Washington University in St. Louis), I-Ting Angelina Lee (Washington University in St. Louis).