SDI Seminar

Speaker: Farnam Jahanian, Department of EECS, University of Michigan

Testing of Fault-Tolerant Distributed Systems via Probing and Fault Injection

Date: Thursday, March 7, 1996

Abstract: As software for distributed systems becomes more complex, ensuring that a system meets its prescribed specification is a growing challenge for software developers. This is particularly important for distributed systems with strict dependability and timeliness constraints. This talk reports on ORCHESTRA, a portable fault injection environment for testing implementations of distributed protocols. The tool is based on a simple yet powerful framework, called script-driven probing and fault injection, for evaluating and validating the fault-tolerance and timing characteristics of distributed protocols. Initially developed on the Real-Time Mach operating system and later ported to other platforms, including Solaris and SunOS, it has been used to conduct extensive experiments on several prototype and commercial protocols, including five implementations of TCP. The talk also covers the development of the tool and some of the surprising results from these experiments.

(Joint work with Scott Dawson and Todd Mitton at the University of Michigan.)