Date: March 24, 1994

Speaker: Mark Holland

Highly Available Disk Arrays

Abstract:
There exists a wide variety of applications in which data availability must be continuous, that is, where the system is never taken off-line and any interruption in the accessibility of stored data causes significant disruption in the service provided by the application. Examples include on-line transaction processing systems such as airline reservation systems, and automated teller networks in banking systems. In addition, there exist many applications for which a high degree of data availability is important, but continuous operation is not required. An example is a research and development environment, where access to a centrally-stored CAD system is often necessary to make progress on a design project. These applications and many others mandate both high performance and high availability from their storage subsystems.

Parity-based redundant disk arrays are very attractive storage alternatives for these systems because they offer both low cost per megabyte and high data reliability. Unfortunately, such systems exhibit poor availability characteristics; their performance is severely degraded in the presence of a disk failure. This talk addresses the design of parity-based redundant disk arrays that offer dramatically higher levels of performance in the presence of failure than systems comprising the current state of the art.

The talk considers two primary aspects of the failure-recovery problem: the organization of the data and redundancy in the array, and the algorithms used to recover the lost data. Additionally, the talk develops a design for a redundant disk array targeted at extremely high availability through extremely fast failure recovery. This development also demonstrates the generality of the techniques presented here.

SDI / LCS Seminar Questions?
Karen Lindenfelser, 86716, or visit www.pdl.cmu.edu/SDI/