PARALLEL DATA LAB 

PDL Abstract

A Redundant Disk Array Architecture for Efficient Small Writes

Carnegie Mellon University Technical Report CMU-CS-94-170, July 1994.

Daniel Stodolsky, Mark Holland, William V. Courtright II, and Garth A. Gibson

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
garth@cs.cmu.edu

http://www.pdl.cmu.edu/

Parity-encoded redundant disk arrays provide highly reliable, cost-effective secondary storage with high performance for reads and large writes. Their performance on small writes, however, is much worse than that of mirrored disks, the traditional, highly reliable, but expensive organization for secondary storage. Unfortunately, small writes are a substantial portion of the I/O workload of many important, demanding applications such as on-line transaction processing. This paper presents parity logging, a novel solution to the small write problem for redundant disk arrays. Parity logging applies journalling techniques to substantially reduce the cost of small writes. We provide detailed models of parity logging and competing schemes (mirroring, floating storage, and RAID level 5) and verify these models by simulation. Parity logging provides performance competitive with mirroring, but with capacity overhead close to the minimum offered by RAID level 5. Finally, parity logging can exploit data caching more effectively than all three alternative approaches.
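
As background for the small write problem named above, the following is a minimal sketch of the standard read-modify-write parity update in a RAID level 5 array. It is not code from the report and does not model parity logging itself. Overwriting one data unit requires reading the old data and old parity and then writing the new data and new parity, so a small write costs four disk accesses where mirroring needs two. The function names, block sizes, and block contents are assumptions chosen for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Read-modify-write: new parity = old parity XOR old data XOR new data."""
    return xor_blocks(old_parity, old_data, new_data)


# Hypothetical 4-byte stripe units on three data disks plus one parity disk.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(*stripe)

# Overwrite the unit on the second data disk.
new_unit = b"\xff\x00\xff\x00"
updated_parity = small_write_parity(stripe[1], new_unit, parity)
stripe[1] = new_unit

# The incremental update matches parity recomputed from the whole stripe.
assert updated_parity == xor_blocks(*stripe)
```

The final assertion checks that the incremental update agrees with recomputing parity over the full stripe, which is why only the old data and old parity need to be read.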
