PARALLEL DATA LAB

Self-* Storage Systems




Human administration of storage systems is a large and growing issue in modern IT infrastructures. We are exploring new storage architectures that integrate automated management functions and simplify the human administrative task. Self-* storage systems are self-configuring, self-organizing, self-tuning, self-healing, self-managing systems of storage bricks. Borrowing organizational ideas from corporate structure and technologies from AI and control systems, self-* storage should simplify storage administration, reduce system cost, increase system robustness, and simplify system construction.
___
1 Pronounced "self-star"; the name is a play on the UNIX shell wildcard character, '*'. It captures many recent buzzwords in a single meta-buzzword.


White Paper

  • Self-* Storage: Brick-based Storage with Automated Administration. Gregory R. Ganger, John D. Strunk, Andrew J. Klosterman. Published as Carnegie Mellon University Technical Report, CMU-CS-03-178, August 2003.
    Abstract / PDF [553K]

People

FACULTY

Anastassia Ailamaki
Anthony Brockwell
Chuck Cranor
Christos Faloutsos
Greg Ganger
Mike Reiter

STAFF

Manish Prasad
Raja Sambasivan
Terrence Wong

STUDENTS

Michael Abd-El-Malek
Kinman Au
Garth Goodson
James Hendricks
Andrew Klosterman
Nuno Loureiro
Chris Lumb
Mike Mesnier
Adrian Ng
Spiros Papadimitriou
Brandon Salmon
Jiri Schindler
Shafeeq Sinnamohideen
Craig Soules
John Strunk
Eno Thereska
Matthew Wachs
Mengzhi Wang
Jay Wylie


Publications

  • Visualizing Request-flow Comparison to Aid Performance Diagnosis in Distributed Systems. Raja R. Sambasivan, Ilari Shafer, Michelle L. Mazurek, Gregory R. Ganger. IEEE Transactions on Visualization and Computer Graphics (Proceedings Information Visualization 2013), vol. 19, no. 12, Dec. 2013.
    Abstract / PDF [1.9M] / TRAILER VIDEO [5.6M] / VIDEO [17.9M]

  • Diagnosing Performance Changes in Distributed Systems by Comparing Request Flows. Raja R. Sambasivan. Carnegie Mellon University Parallel Data Lab Ph.D. Dissertation. CMU-PDL-13-105, May 2013.
    Abstract / PDF [3.9M]

  • Automated Diagnosis without Predictability is a Recipe for Failure. Raja R. Sambasivan & Gregory R. Ganger. Proceedings of the 4th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '12), June 12-13, 2012, Boston, MA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-11-101.
    Abstract / PDF [368K]

  • End-to-end Tracing in HDFS. William Wang. Carnegie Mellon University School of Computer Science Technical Report (Masters Thesis) CMU-CS-11-120, July 2011.
    Abstract / PDF [489K]

  • Diagnosing Performance Changes by Comparing Request Flows. Raja R. Sambasivan, Alice X. Zheng, Michael De Rosa, Elie Krevat, Spencer Whitman, Michael Stroucken, William Wang, Lianghong Xu, Gregory R. Ganger. 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI'11). March 30 - April 1, 2011. Boston, MA.
    Abstract / PDF [388K]

  • Improving Storage Bandwidth Guarantees with Performance Insulation. Matthew Wachs, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-10-113, October 2010.
    Abstract / PDF [285K]

  • Diagnosing Performance Changes by Comparing System Behaviours. Raja R. Sambasivan, Alice X. Zheng, Elie Krevat, Spencer Whitman, Michael Stroucken, William Wang, Lianghong Xu, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-10-107. July 2010. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-10-103.
    Abstract / PDF [503K]

  • A Transparently-Scalable Metadata Service for the Ursa Minor Storage System. Shafeeq Sinnamohideen, Raja R. Sambasivan, James Hendricks, Likun Liu, Gregory R. Ganger. USENIX Annual Technical Conference, Boston, MA, June 23-25, 2010. Supersedes Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-10-102. March 2010.
    Abstract / PDF [230K]

  • Zzyzx: Scalable Fault Tolerance Through Byzantine Locking. James Hendricks, Shafeeq Sinnamohideen, Gregory R. Ganger, Michael K. Reiter. Proceedings of the 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. Chicago, Illinois, June 2010.
    Abstract / PDF [231K]

  • Diagnosing Performance Problems by Visualizing and Comparing System Behaviours. Raja R. Sambasivan, Alice X. Zheng, Elie Krevat, Spencer Whitman, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-10-103, February 2010.
    Abstract / PDF [423K]

  • Delayed Instantiation Bulk Operations for Management of Distributed, Object-based Storage Systems. Andrew J. Klosterman. Ph.D. Dissertation. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-09-108, August 2009.
    Abstract / PDF [2M]

  • Efficient Byzantine Fault Tolerance for Scalable Storage and Services. James Hendricks. Carnegie Mellon School of Computer Science Ph.D. Dissertation CMU-CS-09-146. July 2009.
    Abstract / PDF [1.1M]

  • Co-scheduling of Disk Head Time in Cluster-based Storage. Matthew Wachs, Gregory R. Ganger. 28th International Symposium on Reliable Distributed Systems, September 27-30, 2009. Niagara Falls, New York, U.S.A. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-08-113. October 2008.
    Abstract / PDF [245K]

  • Relative Fitness Modeling. Michael P. Mesnier, Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, and Gregory R. Ganger. Communications of the ACM, Vol. 52 No. 4, April 2009.
    Abstract / PDF [775K]

  • Using Utility Functions to Control a Distributed Storage System. John D. Strunk. Carnegie Mellon University, Dept. ECE Ph.D. Dissertation CMU-PDL-08-102, May 2008.
    Abstract / PDF [940K]

  • On Modeling the Relative Fitness of Storage. Michael P. Mesnier. Carnegie Mellon University, Dept. ECE Ph.D. Dissertation CMU-PDL-07-108, December 19, 2007.
    Abstract / PDF [1.16M]

  • Using Utility to Provision Storage Systems. John D. Strunk, Eno Thereska, Christos Faloutsos, Gregory R. Ganger. 6th USENIX Conference on File and Storage Technologies (FAST '08). Feb. 26-29, 2008. San Jose, CA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-07-106, September 2007.
    Abstract / PDF [310K]

  • Low-overhead Byzantine Fault-tolerant Storage. James Hendricks, Gregory R. Ganger, Michael K. Reiter. Proceedings of the Twenty-First ACM Symposium on Operating Systems Principles (SOSP 2007), Stevenson, WA, October 2007.
    Abstract / PDF [280K]

  • Enabling What-if Explorations in Systems. Eno Thereska. Carnegie Mellon University, Dept. ECE Ph.D. Dissertation CMU-PDL-07-103, August 2007.
    Abstract / PDF [2.35M]

  • Verifying Distributed Erasure-coded Data. James Hendricks, Gregory R. Ganger, Michael K. Reiter. To appear in Proceedings of the Twenty-Sixth Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC 2007), Portland, August 2007.
    Abstract / PDF [193K]

  • Categorizing and Differencing System Behaviours. Raja R. Sambasivan, Alice X. Zheng, Eno Thereska, Gregory R. Ganger. Second Workshop on Hot Topics in Autonomic Computing. June 15, 2007. Jacksonville, FL.
    Abstract / PDF [120K]

  • Observer: Keeping System Models from Becoming Obsolete. Eno Thereska, Dushyanth Narayanan, Anastassia Ailamaki, Gregory R. Ganger. Second Workshop on Hot Topics in Autonomic Computing. June 15, 2007. Jacksonville, FL. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-07-101, January 2007.
    Abstract / PDF [75K]

  • Modeling the Relative Fitness of Storage. Michael P. Mesnier, Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, Gregory R. Ganger. SIGMETRICS’07, June 12–16, 2007, San Diego, California, USA.
    Abstract / PDF [235K]

  • Argon: Performance Insulation for Shared Storage Servers. Matthew Wachs, Michael Abd-El-Malek, Eno Thereska, Gregory R. Ganger. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13–16, 2007, San Jose, CA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-106, May 2006.
    Abstract / PDF [167K]

  • Eliminating Cross-server Operations in Scalable File Systems. James Hendricks, Shafeeq Sinnamohideen, Raja R. Sambasivan, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-105, May 2006.
    Abstract / PDF [254K]

  • Improving Small File Performance in Object-based Storage. James Hendricks, Raja R. Sambasivan, Shafeeq Sinnamohideen, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-104, May 2006.
    Abstract / PDF [1.45M]

  • Early Experiences on the Journey Towards Self-* Storage. Michael Abd-El-Malek, William V. Courtright II, Chuck Cranor, Gregory R. Ganger, James Hendricks, Andrew J. Klosterman, Michael Mesnier, Manish Prasad, Brandon Salmon, Raja R. Sambasivan, Shafeeq Sinnamohideen. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, September 2006.
    Abstract / PDF [113K] / Postscript [745K]

  • InteMon: Continuous Mining of Sensor Data in Large-scale Self-* Infrastructures. Evan Hoke, Jimeng Sun, John D. Strunk, Gregory R. Ganger, and Christos Faloutsos. ACM SIGOPS Operating Systems Review. Vol 40 Issue 3. July, 2006. ACM Press.
    Abstract / PDF [573K]

  • Stardust: Tracking Activity in a Distributed Storage System. Eno Thereska, Brandon Salmon, John Strunk, Matthew Wachs, Michael Abd-El-Malek, Julio Lopez, Gregory R. Ganger. Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems, (SIGMETRICS'06). June 26th-30th 2006, Saint-Malo, France.
    Abstract / PDF [578K]

  • Informed Data Distribution Selection in a Self-predicting Storage System. Eno Thereska, Michael Abd-El-Malek, Jay J. Wylie, Dushyanth Narayanan, Gregory R. Ganger. Proceedings of the International Conference on Autonomic Computing (ICAC-06), Dublin, Ireland. June 12th-16th 2006. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-06-101, January 2006.
    Abstract / PDF [196K]

  • Correctness of the Read/Conditional-Write and Query/Update Protocols. Michael Abd-El-Malek, Gregory R. Ganger, Garth R. Goodson, Michael K. Reiter, Jay J. Wylie. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-05-107, September, 2005.
    Abstract / PDF [392K]

  • Ursa Minor: Versatile Cluster-based Storage. Michael Abd-El-Malek, William V. Courtright II, Chuck Cranor, Gregory R. Ganger, James Hendricks, Andrew J. Klosterman, Michael Mesnier, Manish Prasad, Brandon Salmon, Raja R. Sambasivan, Shafeeq Sinnamohideen, John D. Strunk, Eno Thereska, Matthew Wachs, Jay J. Wylie. Proceedings of the 4th USENIX Conference on File and Storage Technology (FAST '05). San Francisco, CA. December 13-16, 2005. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-05-104, April, 2005.
    Abstract / PDF [490K]

  • D-SPTF: Decentralized Request Distribution in Brick-based Storage Systems. Christopher R. Lumb. Carnegie Mellon University Parallel Data Lab Ph.D. Dissertation CMU-PDL-05-111, December, 2005.
    Abstract / PDF [1.2M]

  • Modeling the Relative Fitness of Storage Devices. Michael Mesnier, Matthew Wachs, Gregory Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-05-106, August, 2005.
    Abstract / PDF [190K]

  • Towards self-predicting systems: What if you could ask “what-if”? Eno Thereska, Dushyanth Narayanan, Gregory R. Ganger. 3rd International Workshop on Self-adaptive and Autonomic Computing Systems. Copenhagen, Denmark, August 2005. Supersedes Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-05-10, February 2005.
    Abstract / PDF [110K]

  • Comparison-based File Server Verification. Yuen-Lin Tan, Terrence Wong, John D. Strunk, Gregory R. Ganger. USENIX '05 Annual Technical Conference, April 10-15, 2005. Anaheim, CA.
    Abstract / Postscript [900K] / PDF [130K]

  • Challenges in Building a Two-Tiered Learning Architecture for Disk Layout. Brandon Salmon, Eno Thereska, Craig A.N. Soules, John D. Strunk, Gregory R. Ganger. Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-04-109. August, 2004.
    Abstract / Postscript [6.8M] / PDF [150K]

  • Storage Device Performance Prediction with CART Models. Mengzhi Wang, Kinman Au, Anastassia Ailamaki, Anthony Brockwell, Christos Faloutsos, and Gregory R. Ganger. Proc. 12th Annual Meeting of the IEEE/ACM International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS). Volendam, The Netherlands. October 5-7, 2004. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-103, March 2004.
    Abstract / Postscript [908K] / PDF [122K]

  • D-SPTF: Decentralized Request Distribution in Brick-based Storage Systems. Christopher R. Lumb, Richard Golding, Gregory R. Ganger. Proceedings of ASPLOS’04, October 7–13, 2004, Boston, Massachusetts, USA.
    Abstract / PDF [281K]

  • The Safety and Liveness Properties of a Protocol Family for Versatile Survivable Storage Infrastructures. Garth R. Goodson, Jay J. Wylie, Gregory R. Ganger, Michael K. Reiter. Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-03-105. March 2004.
    Abstract / Postscript [922K] / PDF [227K]

  • Efficient Byzantine-tolerant Erasure-coded Storage. Garth R. Goodson, Jay J. Wylie, Gregory R. Ganger, Michael K. Reiter. Proceedings of the International Conference on Dependable Systems and Networks (DSN-2004). Palazzo dei Congressi, Florence, Italy. June 28th - July 1, 2004. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-03-104, December 2003.
    Abstract / Postscript [2.3M] / PDF [253K]

  • Storage Device Performance Prediction with CART Models [Extended Abstract]. Mengzhi Wang, Kinman Au, Anastassia Ailamaki, Anthony Brockwell, Christos Faloutsos, and Gregory R. Ganger. Proceedings: Poster Session. Joint International Conference on Measurement and Modeling of Computer Systems. ACM SIGMETRICS/Performance 2004. June 12th-16th 2004, Columbia University, New York.
    Abstract / Postscript [400K] / PDF [64K]

  • A Protocol Family for Versatile Survivable Storage Infrastructures. Garth R. Goodson, Jay J. Wylie, Gregory R. Ganger, Michael K. Reiter. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-03-103, December 2003.
    Abstract / Postscript [925K] / PDF [321K]

  • File Classification in Self-* Storage Systems. Michael Mesnier, Eno Thereska, Daniel Ellard, Gregory R. Ganger, Margo Seltzer. Proceedings of the First International Conference on Autonomic Computing (ICAC-04). New York, NY. May 2004. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-101, January 2004.
    Abstract / Postscript [1.6M] / PDF [80K]

  • D-SPTF: Decentralized Request Distribution in Brick-based Storage. Christopher R. Lumb, Gregory R. Ganger, Richard Golding. Carnegie Mellon University School of Computer Science Technical Report CMU-CS-03-202, November, 2003.
    Abstract / PDF [475K]

  • Attribute-Based Prediction of File Properties. Daniel Ellard, Michael Mesnier, Eno Thereska, Gregory R. Ganger, Margo Seltzer. Harvard Computer Science Group Technical Report TR-14-03, December 2003.
    Abstract / Postscript [850K] / PDF [127K]

  • Self-* Storage: Brick-based Storage with Automated Administration. Gregory R. Ganger, John D. Strunk, Andrew J. Klosterman. Published as Carnegie Mellon University Technical Report, CMU-CS-03-178, August 2003.
    Abstract / PDF [650K]

  • A Human Organization Analogy for Self-* Systems. John D. Strunk, Gregory R. Ganger. First Workshop on Algorithms and Architectures for Self-Managing Systems. In conjunction with Federated Computing Research Conference (FCRC). San Diego, CA. June 11, 2003. Also published as Carnegie Mellon University SCS Technical Report CMU-CS-03-129.
    Abstract / Postscript [273K] / PDF [68K]

  • Efficient Consistency for Erasure-coded Data via Versioning Servers. Garth R. Goodson, Jay J. Wylie, Gregory R. Ganger, Michael K. Reiter. Carnegie Mellon University Technical Report CMU-CS-03-127, April 2003.
    Abstract / Postscript [290K] / PDF [160K]


Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 0326453. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

We thank the members and companies of the PDL Consortium: Amazon, Google, Hitachi Ltd., Honda, Intel Corporation, IBM, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.