PDL Abstract

Structuring PLFS for Extensibility

8th Parallel Data Storage Workshop, Nov 18, 2013, Denver, CO.

Chuck Cranor, Milo Polte*, Garth A. Gibson

Carnegie Mellon University
*WibiData, Inc.

The Parallel Log Structured Filesystem (PLFS) was designed to transparently transform highly concurrent, massive high-performance computing (HPC) N-to-1 checkpoint workloads into N-to-N workloads to avoid single-file performance bottlenecks in typical HPC distributed filesystems. PLFS has produced speedups of 2-150X for N-1 workloads at Los Alamos National Lab. Having successfully improved N-1 performance, we have restructured PLFS for extensibility so that it can be applied to more workloads and storage systems. In this paper we describe PLFS' evolution from a single-purpose log-structured middleware filesystem into a more general platform for transparently translating application I/O patterns. As an example of this extensibility, we show how PLFS can now be used to enable HPC applications to perform N-1 checkpoints on an HDFS-based cloud storage system.
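
The N-to-1 to N-to-N translation described above can be illustrated with a small in-memory sketch. This is not PLFS code and does not use the PLFS API; the names Container, WriterLog, write, and read_all are hypothetical, and the sketch assumes only the general idea that each writer appends to its own log and records an index entry mapping logical file offsets to positions in that log. A read of the shared logical file is then reconstructed by replaying the index entries.

```python
from dataclasses import dataclass, field


@dataclass
class WriterLog:
    """Hypothetical per-writer log: appended data plus an index of where it belongs."""
    data: bytearray = field(default_factory=bytearray)
    # Each entry: (sequence number, logical offset, length, physical offset in this log)
    index: list = field(default_factory=list)


class Container:
    """Hypothetical stand-in for one logical shared file backed by N per-writer logs."""

    def __init__(self):
        self.logs = {}   # writer id -> WriterLog
        self.seq = 0     # global write order, so later writes win on overlap

    def write(self, writer_id, logical_offset, payload):
        # Every write is an append to the writer's own log, so writers
        # never contend for the same underlying file.
        log = self.logs.setdefault(writer_id, WriterLog())
        physical_offset = len(log.data)
        log.data += payload
        log.index.append((self.seq, logical_offset, len(payload), physical_offset))
        self.seq += 1

    def read_all(self):
        # Merge all index entries and replay them in write order to
        # rebuild the logical file that the application believes it wrote.
        entries = [
            (seq, lo, ln, po, wid)
            for wid, log in self.logs.items()
            for (seq, lo, ln, po) in log.index
        ]
        size = max((lo + ln for _, lo, ln, _, _ in entries), default=0)
        out = bytearray(size)
        for _, lo, ln, po, wid in sorted(entries):
            out[lo:lo + ln] = self.logs[wid].data[po:po + ln]
        return bytes(out)


if __name__ == "__main__":
    c = Container()
    # Three writers each write one strided region of the same logical file.
    c.write(1, 4, b"BBBB")
    c.write(0, 0, b"AAAA")
    c.write(2, 8, b"CCCC")
    print(len(c.logs), c.read_all())
```

In this sketch the application-visible pattern is still N writers to one file, while the backing store sees N independent append-only logs, which is the shape of workload the abstract says avoids single-file bottlenecks.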