On Correlated Failures in Survivable Storage Systems

Carnegie Mellon University Technical Report CMU-CS-02-129, May 2002.

Mehmet Bakkaloglu, Jay J. Wylie, Chenxi Wang, Gregory R. Ganger

Dept. of Electrical & Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

The design of survivable storage systems involves inherent trade-offs among properties such as performance, security, and availability. A toolbox of simple and accurate models of these properties allows a designer to make informed decisions. This report focuses on availability modeling. We describe two ways of extending the classic model of availability with a single "correlation parameter" to accommodate correlated failures. We evaluate the efficacy of the models by comparing their results with real measurements. We also show the use of the models as design decision tools: we analyze the effects of availability and correlation on the ordering of data distribution schemes, and we investigate the placement of related files.

