PARALLEL DATA LAB 

PDL Abstract

dBug: Systematic Evaluation of Distributed Systems

18th International Workshop on Model Checking of Software (SPIN'11), Snowbird UT, July 2011.

Jiri Simsa, Randy Bryant, Garth A. Gibson

Parallel Data Laboratory
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

In order to improve quality of an implementation of a distributed and multi- threaded system, software engineers inspect code and run tests. However, the concurrent nature of such systems makes these tasks challenging. For testing, this problem is addressed by stress testing, which repeatedly executes a test hoping that eventually all possible outcomes of the test will be encountered. In this paper we present the dBug tool, which implements an alternative method to stress testing called systematic testing. The systematic testing method implemented by dBug controls the order in which certain concurrent function calls occur. By doing so, the method can systematically enumerate possible inter- leavings of function calls in an execution of a concurrent system. The dBug tool can be thought of as a light-weight model checker, which uses the implementation of a distributed and multi-threaded system and its test as an implicit description of the state space to be explored. In this state space, the dBug tool performs a reachability analysis checking for a number of safety properties including the absence of 1) deadlocks, 2) con icting non-reentrant function calls, and 3) system aborts and runtime assertions inserted by the user.

FULL PAPER: pdf