Redundant Arrays of Independent Disks
This page contains a brief summary of RAID technology for readers who
are unfamiliar with it, so that our RAID research can be understood
within a larger context. More detailed background can be found in Garth
Gibson's Ph.D. thesis, Mark Holland's
Ph.D. thesis, and the RAIDframe
documentation, which excerpts chapter two of Mark Holland's thesis
and draws heavily upon Bill Courtright's
thesis. These three documents also refer to a large number of technical
papers, including several available from our
Publications page, which can provide even more depth on specific
variants of RAID technology.
Several trends in the computer industry over the past decade have driven
the design of storage subsystems toward increasing parallelism: systems
achieve better I/O performance by increasing the number of disks rather
than the performance of any individual disk. These trends
include a widening gap between the speed of CPUs and that of disks, the
shrinking size of disk drives, and new I/O-intensive applications such
as digital video, scientific visualization, and spatial databases.
By storing data redundantly, arrays of disks can withstand the
failure of a single disk. There are several methods
for maintaining redundant data, and in 1988 the different methods were
categorized into a taxonomy known as RAID (Redundant Arrays of Inexpensive
Disks -- Inexpensive was later changed to Independent) by a research
group at U.C.-Berkeley headed by David Patterson. Garth Gibson, the
head of the Parallel Data Lab, completed his thesis while
working on this research project.
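One widely used form of redundancy is parity: one disk stores the bitwise
XOR of the corresponding data on the other disks, so the contents of any
single failed disk can be recomputed from the survivors. The following
minimal sketch in Python is ours, for illustration only; disk contents
are modeled as byte strings.

    from functools import reduce

    def parity(blocks):
        """Bitwise XOR of equal-length blocks -- the redundant block."""
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    # Three data disks plus one parity disk.
    data = [b"disk0_data", b"disk1_data", b"disk2_data"]
    p = parity(data)

    # If disk 1 fails, XOR-ing the surviving data with the parity
    # reproduces the lost contents exactly.
    assert parity([data[0], data[2], p]) == data[1]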
Originally, there were five RAID levels, but the phrase "RAID Level
0" is now commonly used to refer to a nonredundant disk array and RAID
Level 6 has been added to the numbered levels. Additionally, a number
of researchers have proposed variations on RAID levels, including several
by the PDL: write deferring, parity
logging, parity declustering,
and log-structured storage.
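For reference, a RAID Level 0 array simply stripes logical blocks
round-robin across its disks, with no redundant information at all. A
minimal sketch of the address mapping (assuming, for illustration, a
stripe unit of one block):

    def raid0_map(logical_block, num_disks):
        """Map a logical block to (disk, offset) under round-robin striping."""
        return logical_block % num_disks, logical_block // num_disks

    # Blocks 0..5 on a three-disk array land on disks 0, 1, 2, 0, 1, 2.
    print([raid0_map(b, 3) for b in range(6)])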
Today, RAID systems are an extremely profitable product for the storage
industry -- the market for RAID exceeded $3 billion in 1994 and is expected
to surpass $13 billion by 1997. But they do have their limitations.
First, there is the cost of maintaining redundant data -- both in terms
of disk space and the time it takes to access the disks in the array.
Second, redundant arrays must handle transient operating errors as well
as tolerate the failure of a disk: they must recover lost data
(reliability) and perform well while the system restores that data
on-line (availability). Ensuring both is a complex process, and it
becomes more difficult as each new RAID optimization
is proposed. Third, many applications access disk drives serially, meaning
that they are unable to take advantage of the parallelism offered by
disk arrays. Finally, RAID systems directly attached to a host system
bus are inherently not scalable.
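The access-time cost behind the first limitation is easiest to see in the
well-known "small write" case for a parity-protected array such as RAID
Level 5: updating one data block also requires updating its parity block,
so the standard read-modify-write sequence turns a single logical write
into four disk accesses. A sketch follows; the read_block and write_block
helpers are hypothetical stand-ins for the array's internal I/O routines.

    def small_write(addr, new_data, read_block, write_block):
        """RAID Level 5 read-modify-write: one logical write, four accesses."""
        old_data   = read_block("data", addr)     # access 1: read old data
        old_parity = read_block("parity", addr)   # access 2: read old parity
        # new parity = old parity XOR old data XOR new data
        new_parity = bytes(p ^ o ^ n
                           for p, o, n in zip(old_parity, old_data, new_data))
        write_block("data", addr, new_data)       # access 3: write new data
        write_block("parity", addr, new_parity)   # access 4: write new parity

Optimizations such as the parity logging and write deferring work mentioned
above aim precisely at softening this penalty.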
Recognizing these limitations, members of the Parallel Data Lab have
moved beyond proposing optimizations for specific RAID levels to solving
problems common to all redundant arrays. Mark Holland's thesis, On-line
Data Reconstruction in Redundant Disk Arrays, targets the reliability
and availability of RAID systems. He offers a disk-oriented reconstruction
algorithm for restoring data lost during a single disk failure.
This algorithm maximizes the efficiency of reconstruction without significantly
penalizing response time for the system user.
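In outline, the disk-oriented approach dedicates one sequential reading
process to each surviving disk, rather than reconstructing stripe by
stripe, so every disk in the array stays busy. The sketch below is our
simplified rendering of the idea, not Holland's code: a serial loop
stands in for the per-disk processes, and each disk is assumed to expose
a read(stripe) call.

    def reconstruct(survivors, num_stripes, write_replacement):
        """Disk-oriented reconstruction: sweep each surviving disk
        sequentially, XOR its units into per-stripe buffers, and write a
        stripe's rebuilt unit once every survivor has contributed."""
        buffers = {}
        remaining = {s: len(survivors) for s in range(num_stripes)}
        for disk in survivors:                 # in practice, one process per disk
            for stripe in range(num_stripes):  # a sequential sweep of this disk
                unit = disk.read(stripe)
                acc = buffers.get(stripe, bytes(len(unit)))
                buffers[stripe] = bytes(a ^ u for a, u in zip(acc, unit))
                remaining[stripe] -= 1
                if remaining[stripe] == 0:     # all survivors seen: recovered
                    write_replacement(stripe, buffers.pop(stripe))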
Hugo Patterson began working on a "smart" disk controller before realizing
that the first issue to address was how "smartly" applications use disks.
This research, which is really situated within the context of file systems
and is independent of the number of disks in the storage subsystem,
has had significant implications for the third major limitation on RAID
systems: the inability of most applications to access disks in the array
in parallel. Hugo's solution to this problem enables applications with
serial I/O workloads to mimic parallel applications by fetching needed
data in advance. His research in informed
prefetching and caching is covered elsewhere in the PDL Web pages.
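The core mechanism is disclosure: a serial application hands the storage
system an ordered list of the blocks it will soon read, and the system
keeps several of those fetches in flight at once. The sketch below uses
assumed names (hints, fetch_block) to convey the shape of the idea, not
the actual interface described on those pages.

    from concurrent.futures import ThreadPoolExecutor

    def informed_read(hints, fetch_block, depth=4):
        """Sketch of informed prefetching: 'hints' is the application's
        disclosed future access sequence; up to 'depth' fetches run
        concurrently, so a serial reader drives several disks at once."""
        with ThreadPoolExecutor(max_workers=depth) as pool:
            futures = [pool.submit(fetch_block, h) for h in hints]  # prefetch
            for f in futures:
                yield f.result()  # deliver blocks in the application's order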
While Mark Holland's research addressed how the array restores data
lost from a specific failed disk without penalizing the system user
excessively, Bill Courtright's
research focused more broadly upon how the array handles errors
that occur while it is operating -- independent of the specific cause
for the error. Bill began his research by investigating how to reduce the
complexity associated with handling errors in RAID systems. His work was
motivated by a very real need in industry: a large fraction (over half)
of array code was being devoted to handling architecture-specific errors.
Verifying the correctness of this code is difficult, and the code is not
easily extended to support new architectures.
In separating error-handling from code specific to RAID architecture,
Bill chose to model RAID operations using directed acyclic graphs (DAGs)
because they offer an intuitive, visual structuring of sequences of
disk operations. Mark then built upon this approach when he began developing
a general-purpose RAID controller. The main implication of Mark's new
work was that a truly general-purpose controller would allow RAID designers
to prototype new RAID designs quickly.
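As a concrete illustration, the small-write operation sketched earlier
becomes a five-node DAG whose edges are ordering constraints; any
execution that respects the edges is correct, and independent nodes may
run in parallel. This is our hand-drawn example in the spirit of that
approach, not RAIDframe's actual graph representation.

    from graphlib import TopologicalSorter

    # RAID Level 5 small-write DAG: each node lists the nodes it waits for.
    dag = {
        "ReadOldData":    [],
        "ReadOldParity":  [],
        "XorParity":      ["ReadOldData", "ReadOldParity"],
        "WriteNewData":   ["ReadOldData"],   # read old data before overwrite
        "WriteNewParity": ["XorParity"],
    }

    # One legal serial schedule; a real engine runs independent nodes
    # in parallel.
    print(list(TopologicalSorter(dag).static_order()))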
At this point, the work on error-handling and the general-purpose
RAID controller naturally merged into a project to develop an extensible
RAID framework, which we have named RAIDframe
(in the tradition of RAIDsim, the simulation tool developed at U.C.-Berkeley).
RAIDframe offers RAID designers a number of benefits. First, separating
error-handling from RAID-specific code lets designers reuse over 90%
of the code to build new RAID systems. It also means that error-handling
can be automated across all RAID designs. Second, modeling RAID operations
as DAGs means that techniques for verifying the correctness of software
designs can be used -- even before they are implemented in code. Essentially,
RAIDframe allows designers to address the current limitations of RAID
systems by quickly prototyping new designs that are verifiably correct,
handle errors transparently, and recover from failed disks with minimal
degradation in performance.
We thank the members and companies of the PDL Consortium: American Power
Conversion, Data Domain, Inc., Sun Microsystems, Symantec Corporation,
and VMware, Inc. for their interest, insights, feedback, and support.