Performance Insulation for Shared Storage Servers
Services that share a storage infrastructure should realize the same efficiency, within their share of time, as when they have the infrastructure to themselves. Unfortunately, traditional systems offer nothing close to this ideal: inter-service disk and cache interference creates large and unpredictable inefficiencies. A new metric, the R-value, is an explicitly configured lower bound on the storage efficiency a service will receive (relative to its non-sharing performance), no matter what other services share the infrastructure. The Argon storage server explicitly manages its resources to bound the inefficiency that such interference causes. The goal is to provide each service with at least a configured fraction of the throughput it achieves when it has the storage server to itself, within its share of the server; a service allocated 1/nth of a server should get nearly 1/nth (or more) of the throughput it would get alone. Argon uses prefetching, write coalescing, cache partitioning, and quanta-based disk scheduling to ensure that R-value commitments are met. For more information, see the extended overview.
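The arithmetic behind an R-value commitment can be illustrated with a small sketch. It assumes the R-value is expressed as a fraction in (0, 1] and that the guaranteed floor is the product of the R-value, the service's share of the server, and its standalone throughput. The function names, parameters, and numbers below are hypothetical illustrations, not part of Argon.

```python
def guaranteed_throughput(standalone_mb_s: float, share: float, r_value: float) -> float:
    """Return the throughput floor implied by an R-value commitment.

    standalone_mb_s: throughput the service achieves with the server to itself
    share:           fraction of the server allocated to the service (e.g. 1/n)
    r_value:         configured efficiency lower bound, assumed to lie in (0, 1]
    """
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    if not 0.0 < r_value <= 1.0:
        raise ValueError("r_value must be in (0, 1]")
    return r_value * share * standalone_mb_s


def meets_commitment(observed_mb_s: float, standalone_mb_s: float,
                     share: float, r_value: float) -> bool:
    """Check whether an observed throughput satisfies the commitment."""
    return observed_mb_s >= guaranteed_throughput(standalone_mb_s, share, r_value)


if __name__ == "__main__":
    # Hypothetical numbers: 40 MB/s alone, half of the server, R-value of 0.9.
    floor = guaranteed_throughput(40.0, 0.5, 0.9)
    print(f"throughput floor: {floor:.1f} MB/s")
    print("19 MB/s meets commitment:", meets_commitment(19.0, 40.0, 0.5, 0.9))
```

Under these assumed numbers, a service that runs at 40 MB/s alone and holds half of the server would be promised roughly 18 MB/s, regardless of what the other half of the server is doing.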
Argon’s high-level architecture. Argon makes use of cache partitioning, request amortization, and quanta-based disk time scheduling.
- Co-scheduling of Disk Head Time in Cluster-based Storage. Matthew Wachs, Gregory R. Ganger. 28th International Symposium on Reliable Distributed Systems, September 27-30, 2009, Niagara Falls, New York, U.S.A. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report
Abstract / PDF [245K]
- Argon: Performance Insulation for Shared Storage Servers. Matthew Wachs, Michael Abd-El-Malek, Eno Thereska, Gregory R. Ganger. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13–16, 2007, San Jose, CA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-106, May 2006.
Abstract / PDF [167K]
This material is based on research sponsored in part by the National Science Foundation, via grants #CNS-0326453 and #CCF-0621499, by the Air Force Research Laboratory, under agreement number F49620–01–1–0433, by the Army Research Office, under agreement number DAAD19–02–1–0389, and by the Department of Energy under Award Number DE-FC02-06ER25767. Matthew Wachs is supported in part by an NDSEG Fellowship, which is sponsored by the Department of Defense.
We thank the members and companies of the PDL Consortium: Actifio, American Power Conversion, EMC Corporation, Facebook, Google, Hewlett-Packard Labs, Hitachi, Huawei Technologies Co., Intel Corporation, Microsoft Research, NEC Laboratories, NetApp, Inc., Oracle Corporation, Panasas, Samsung Information Systems America, Seagate Technology, Symantec Corporation, VMware, Inc., and Western Digital for their interest, insights, feedback, and support.