RAID: Redundant Arrays of Independent Disks
This page contains a brief summary of RAID technology for readers who are unfamiliar with it, so that our RAID research can be understood within a larger context. More detailed background can be found in Garth Gibson's Ph.D. thesis, Mark Holland's Ph.D. thesis, and the RAIDframe documentation, which excerpts chapter two of Mark Holland's thesis and draws heavily upon Bill Courtright's thesis. These three documents also refer to a large number of technical papers, including several available from our Publications page, which can provide even more depth on specific variants of RAID technology.
          
Several trends in the computer industry over the past decade have driven the design of the storage subsystem towards increasing parallelism. This means that systems can and will perform better in terms of I/O by increasing the number of individual disks rather than the performance of each one. These trends include a widening gap between the speed of CPUs and that of disks, the shrinking size of disk drives, and new I/O-intensive applications such as digital video, scientific visualization, and spatial databases.

By adding redundancy in storing data, arrays of disks offer the ability
          to withstand the failure of a single disk. There are several methods 
          for maintaining redundant data, and in 1988 the different methods were 
          categorized into a taxonomy known as RAID (Redundant Arrays of Inexpensive 
          Disks -- Inexpensive was later changed to Independent) by a research 
          group at U.C.-Berkeley headed by David Patterson. Garth Gibson, the 
          head of the Parallel Data Lab, completed his thesis while 
          working on this research project. 
Originally, there were five RAID levels, but the phrase "RAID Level 0" is now commonly used to refer to a nonredundant disk array, and RAID Level 6 has been added to the numbered levels. Additionally, a number
          of researchers have proposed variations on RAID levels, including several 
          by the PDL: write deferring, parity 
          logging, parity declustering, 
          and log-structured storage. 
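
To make the redundancy idea concrete, the short C sketch below computes a parity block for a stripe as the bitwise XOR of its data blocks (the redundancy method used by RAID Levels 4 and 5) and then regenerates the block of a failed disk from the surviving blocks plus the parity. The three-data-disk stripe and the tiny 16-byte blocks are illustrative assumptions chosen for readability, not the layout of any particular array.

/* Minimal sketch of XOR parity, the redundancy method behind RAID
 * Levels 4 and 5.  The 3-data-disk stripe and 16-byte "blocks" are
 * purely illustrative assumptions. */
#include <stdio.h>
#include <string.h>

#define NDATA 3           /* data disks in one stripe             */
#define BLKSZ 16          /* bytes per block (tiny, for display)  */

/* parity block = XOR of all data blocks in the stripe */
static void compute_parity(unsigned char data[NDATA][BLKSZ],
                           unsigned char parity[BLKSZ])
{
    memset(parity, 0, BLKSZ);
    for (int d = 0; d < NDATA; d++)
        for (int i = 0; i < BLKSZ; i++)
            parity[i] ^= data[d][i];
}

/* rebuild the block of one failed disk from the survivors and parity */
static void reconstruct(unsigned char data[NDATA][BLKSZ],
                        unsigned char parity[BLKSZ],
                        int failed, unsigned char out[BLKSZ])
{
    memcpy(out, parity, BLKSZ);
    for (int d = 0; d < NDATA; d++)
        if (d != failed)
            for (int i = 0; i < BLKSZ; i++)
                out[i] ^= data[d][i];
}

int main(void)
{
    unsigned char data[NDATA][BLKSZ] = {
        "disk0 contents.", "disk1 contents.", "disk2 contents."
    };
    unsigned char parity[BLKSZ], rebuilt[BLKSZ];

    compute_parity(data, parity);
    reconstruct(data, parity, 1, rebuilt);   /* pretend disk 1 failed */
    printf("recovered from disk 1: %s\n", (char *)rebuilt);
    return 0;
}

Because the parity is a simple XOR, the same identity recovers every lost block of a single failed disk; the cost is one extra disk's worth of capacity per parity group.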
Today, RAID systems are an extremely profitable product for the storage industry -- the market for RAID exceeded $3 billion in 1994 and is expected to surpass $13 billion by 1997. But these systems do have their limitations.
          First, there is the cost of maintaining redundant data -- both in terms 
          of disk space and the time it takes to access the disks in the array. 
          Second, ensuring that redundant arrays can handle transient operating 
          errors as well as tolerate the failure of a disk -- through the ability 
          to recover lost data (reliability) as well as the ability to perform 
          well while the system restores the data on-line (availability) -- is 
          a complex process that is becoming more difficult as each new RAID optimization 
          is proposed. Third, many applications access disk drives serially, meaning 
          that they are unable to take advantage of the parallelism offered by 
          disk arrays. Finally, RAID systems directly attached to a host system 
          bus are inherently not scalable. 
          Recognizing these limitations, members of the Parallel Data Lab have 
          moved beyond proposing optimizations for specific RAID levels to solving 
          problems common to all redundant arrays. Mark Holland's thesis, On-line 
          Data Reconstruction in Redundant Disk Arrays, targets the reliability 
          and availability of RAID systems. He offers a disk-oriented reconstruction 
          algorithm for restoring data lost during a single disk failure. 
          This algorithm maximizes the efficiency of reconstruction without significantly 
          penalizing response time for the system user. 
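
The sketch below illustrates the structure of such a disk-oriented approach under several simplifying assumptions: the "disks" are in-memory arrays, parity sits on a single disk rather than rotating, and the per-disk sweeps run one after another instead of as concurrent per-disk processes. What it preserves is the key idea that each surviving disk is read in large sequential passes, with a stripe's rebuilt block written to the replacement disk as soon as every survivor has contributed.

/* Simplified, single-threaded sketch of disk-oriented reconstruction.
 * In-memory "disks", a fixed geometry, unrotated parity, and the lack
 * of real concurrency are all simplifying assumptions; the actual
 * algorithm keeps one process per surviving disk running in parallel. */
#include <stdio.h>
#include <string.h>

#define NDISKS   4     /* disks in the array (last one holds parity) */
#define NSTRIPES 8     /* stripes to rebuild                         */
#define BLKSZ    4     /* bytes per block, tiny for demonstration    */

static unsigned char disk[NDISKS][NSTRIPES][BLKSZ];   /* the array    */
static unsigned char spare[NSTRIPES][BLKSZ];          /* replacement  */

int main(void)
{
    int failed = 2;                       /* disk whose data was lost */
    unsigned char acc[NSTRIPES][BLKSZ];   /* per-stripe XOR buffers   */
    int contrib[NSTRIPES] = {0};          /* surviving blocks seen    */

    /* fill the array with arbitrary data and matching parity */
    for (int s = 0; s < NSTRIPES; s++)
        for (int b = 0; b < BLKSZ; b++) {
            unsigned char p = 0;
            for (int d = 0; d < NDISKS - 1; d++) {
                disk[d][s][b] = (unsigned char)(d * 17 + s * 3 + b);
                p ^= disk[d][s][b];
            }
            disk[NDISKS - 1][s][b] = p;   /* parity disk              */
        }

    memset(acc, 0, sizeof acc);

    /* one large sequential sweep over each surviving disk */
    for (int d = 0; d < NDISKS; d++) {
        if (d == failed)
            continue;
        for (int s = 0; s < NSTRIPES; s++) {
            for (int b = 0; b < BLKSZ; b++)
                acc[s][b] ^= disk[d][s][b];
            if (++contrib[s] == NDISKS - 1)        /* stripe complete */
                memcpy(spare[s], acc[s], BLKSZ);   /* write to spare  */
        }
    }

    printf("rebuilt %d stripes; spare[0][0] = %d (expected %d)\n",
           NSTRIPES, spare[0][0], disk[failed][0][0]);
    return 0;
}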
          Hugo Patterson began working on a "smart" disk controller before realizing 
          that the first issue to address was how "smartly" applications use disks. 
          This research, which is really situated within the context of file systems 
          and is independent of the number of disks in the storage subsystem, 
          has had significant implications for the third major limitation on RAID 
          systems: the inability of most applications to access disks in the array 
          in parallel. Hugo's solution to this problem enables applications with 
          serial I/O workloads to mimic parallel applications by fetching needed 
          data in advance. His research in informed 
          prefetching and caching is covered elsewhere in the PDL Web pages. 
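
As a toy illustration of that idea (and not the actual TIP interface), the sketch below lets a strictly serial reader disclose its future block accesses as hints, while a prefetcher keeps a few of those reads in flight ahead of the consumer so that several disks of a striped array stay busy at once. The hint format, prefetch depth, and striping rule are assumptions made purely for the example.

/* Toy sketch of the idea behind informed prefetching: a serial reader
 * discloses the blocks it will need, and a prefetcher issues up to
 * DEPTH of those reads ahead of the consumer, spreading work across
 * the disks of a striped array.  This is an illustration only, not
 * Hugo Patterson's TIP interface. */
#include <stdio.h>

#define NHINTS 12      /* hinted future accesses                    */
#define NDISKS 4       /* disks the blocks are striped across       */
#define DEPTH  4       /* prefetches kept in flight ahead of use    */

int main(void)
{
    int hints[NHINTS];          /* block numbers the reader will use */
    int issued = 0;             /* next hint to prefetch             */
    int busy[NDISKS] = {0};     /* outstanding prefetches per disk   */

    for (int i = 0; i < NHINTS; i++)
        hints[i] = i;           /* a strictly serial workload        */

    for (int used = 0; used < NHINTS; used++) {
        /* keep up to DEPTH hinted reads in flight ahead of the reader */
        while (issued < NHINTS && issued - used < DEPTH)
            busy[hints[issued++] % NDISKS]++;   /* "issue" a prefetch */

        busy[hints[used] % NDISKS]--;   /* the block now being read   */
        printf("use block %2d   in-flight per disk:", hints[used]);
        for (int d = 0; d < NDISKS; d++)
            printf(" %d", busy[d]);
        printf("\n");
    }
    return 0;
}

With a prefetch depth greater than one, the printout shows requests outstanding on several disks at once, which is exactly the parallelism a purely serial reader would otherwise leave unused.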
          While Mark Holland's research addressed how the array restores data 
          lost from a specific failed disk without penalizing the system user 
          excessively, Bill Courtright's 
          research focused more broadly upon how the array handles errors 
          that occur while it is operating -- independent of the specific cause 
          for the error. Bill began his research investigating how to reduce the 
          complexity associated with handling 
errors in RAID systems. His work was motivated by a very real need in industry: a large fraction (over half) of array code was being devoted to handling architecture-specific errors. Verifying the correctness of this code is difficult, and the code is not easily extended to support new architectures.
          In separating error-handling from code specific to RAID architecture, 
          Bill chose to model RAID operations using directed acyclic graphs (DAGs) 
          because they offer an intuitive, visual structuring of sequences of 
          disk operations. Mark then built upon this approach when he began developing 
          a general-purpose RAID controller. The main implication of Mark's new 
          work was that a truly general-purpose controller would allow RAID designers 
          to prototype new RAID designs quickly. 
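
As a concrete example of this modeling style, the sketch below encodes the well-known RAID Level 5 small-write operation as a DAG: read the old data and old parity, XOR them with the new data to produce the new parity, then write the new data and new parity before a final commit node. The node names, the edge list, and the simple topological executor are illustrative assumptions about how such a graph might be represented; RAIDframe's actual node and graph structures differ in their details.

/* Sketch of modeling a RAID operation as a directed acyclic graph,
 * using the classic RAID Level 5 "small write" as the example.  The
 * node/edge representation and the executor below are illustrative
 * assumptions, not RAIDframe's data structures. */
#include <stdio.h>

#define NNODES 6
#define NEDGES 6

static const char *name[NNODES] = {
    "ReadOldData", "ReadOldParity", "XOR",
    "WriteNewData", "WriteNewParity", "Commit"
};

/* edge[i][0] must complete before edge[i][1] may fire */
static const int edge[NEDGES][2] = {
    {0, 2}, {1, 2},   /* both reads feed the XOR with the new data     */
    {0, 3},           /* old data is read before it is overwritten     */
    {2, 4},           /* new parity is written only after the XOR      */
    {3, 5}, {4, 5}    /* both writes precede the commit                */
};

int main(void)
{
    int indeg[NNODES] = {0};
    int done[NNODES]  = {0};
    int fired = 0;

    for (int e = 0; e < NEDGES; e++)
        indeg[edge[e][1]]++;

    /* fire every node whose predecessors have all completed;
     * a real engine would issue the corresponding disk I/Os here */
    while (fired < NNODES) {
        for (int n = 0; n < NNODES; n++) {
            if (done[n] || indeg[n] != 0)
                continue;
            printf("execute %s\n", name[n]);
            done[n] = 1;
            fired++;
            for (int e = 0; e < NEDGES; e++)
                if (edge[e][0] == n)
                    indeg[edge[e][1]]--;
        }
    }
    return 0;
}

Independent nodes, such as the two reads or the two writes, share no edge and so may be executed in parallel; the graph records only the ordering constraints that must hold.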
          At this point, the work on error-handling and the general-purpose 
          RAID controller naturally merged into a project to develop an extensible 
          RAID framework, which we have named RAIDframe 
          (in the tradition of RAIDsim, the simulation tool developed at U.C.-Berkeley). 
          RAIDframe offers RAID designers a number of benefits. First, separating 
          error-handling from RAID-specific code lets designers reuse over 90% 
          of the code to build new RAID systems. It also means that error-handling 
          can be automated across all RAID designs. Second, modeling RAID operations 
          as DAGs means that techniques for verifying the correctness of software 
          designs can be used -- even before they're implemented in code. Essentially, 
          RAIDframe allows designers to address the current limitations of RAID 
          systems by quickly prototyping new designs which are verifiably correct, 
handle errors transparently, and recover from failed disks with minimal performance degradation.
Acknowledgements

We thank the members and companies of the PDL Consortium: American Power Conversion, Data Domain, Inc., EMC Corporation, Facebook, Google, Hewlett-Packard Labs, Hitachi, IBM, Intel Corporation, LSI, Microsoft Research, NetApp, Inc., Oracle Corporation, Seagate Technology, Sun Microsystems, Symantec Corporation, and VMware, Inc. for their interest, insights, feedback, and support.