SDI Seminar

Speaker: John Hartman, University of Arizona

The Swarm Scalable Storage System

Date: November 21, 1996

Abstract:

The traditional network file server has outlived its usefulness. All data transferred between the clients and the storage devices must pass through the server, limiting the file system's performance, scalability, and reliability. In this talk I will present the design of Swarm, a scalable storage system that eliminates the traditional centralized file server, providing scalable, reliable, cost-effective data storage. Data in Swarm are stored on a cluster of storage servers, which provide a block-based rather than file-based interface and are optimized for cost-performance rather than absolute performance. Clients store data on the servers using a "striped log" abstraction, in which newly created file data are appended to a log and striped across the storage cluster. The use of a striped log simplifies storage allocation, improves file access performance, balances server loads, enables the use of computed redundancy to provide fault tolerance, and simplifies crash recovery. Swarm uses a distributed cleaning algorithm to garbage-collect unused portions of the log, and a distributed file management algorithm to maintain file metadata and file system consistency.
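The idea of striping a log across servers with computed redundancy can be illustrated with a minimal sketch. It is not Swarm's implementation: the abstract does not say which redundancy scheme is used, so single XOR parity is assumed here, and the names `FRAGMENT_SIZE`, `make_stripe`, and `reconstruct` are hypothetical.

```python
from functools import reduce
from typing import List, Optional

FRAGMENT_SIZE = 4  # bytes per fragment; a tiny, hypothetical value for illustration


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(log_data: bytes, num_servers: int) -> List[bytes]:
    """Cut one stripe of appended log data into one fragment per server.

    The last fragment is XOR parity over the data fragments (an assumed
    redundancy scheme, not one confirmed by the abstract).
    """
    data_count = num_servers - 1
    capacity = data_count * FRAGMENT_SIZE
    if data_count < 1 or len(log_data) > capacity:
        raise ValueError("log data does not fit in a single stripe")
    padded = log_data.ljust(capacity, b"\0")
    fragments = [
        padded[i * FRAGMENT_SIZE:(i + 1) * FRAGMENT_SIZE]
        for i in range(data_count)
    ]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def reconstruct(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) from the survivors."""
    missing = [i for i, frag in enumerate(stripe) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost fragment")
    rebuilt = list(stripe)
    if missing:
        survivors = [frag for frag in stripe if frag is not None]
        rebuilt[missing[0]] = reduce(xor_bytes, survivors)
    return rebuilt
```

For example, `make_stripe(b"hello world!", 4)` yields three data fragments and one parity fragment, one per server; if any single fragment is replaced with `None` to model a failed server, `reconstruct` recovers it from the other three.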

A Swarm prototype is currently being developed on a cluster of Pentium-based personal computers running the Scout operating system. This prototype will ultimately support a variety of file system protocols, including NFS, the SIO low-level API, and the native Swarm file system.