PDL Abstract

Reconciling LSM-Trees with Modern Hard Drives using BlueFS

Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-102, April 2019.

Abutalib Aghayev, Sage Weil†, Gregory R. Ganger, George Amvrosiadis

Carnegie Mellon University
† Red Hat Inc.


LSM-Trees have become a popular building block in large-scale storage systems where hard drives are the dominant storage medium. Meanwhile, drive makers are shifting to Shingled Magnetic Recording (SMR), a recording technique that increases drive capacity by 25% but also works best with a new, backward-incompatible device interface. Large-scale cloud storage providers are updating their proprietary software stacks to utilize SMR drives, but widespread adoption requires more general-purpose support. This paper introduces BlueFS, an open-source user-space file system that allows widely used LSM-Tree implementations to utilize SMR drives with zero overhead and no code changes. BlueFS’s design aggressively specializes data placement and I/O sizes to exposed SMR drive parameters, while hiding those details. As a result, for example, unmodified RocksDB performs random inserts 64% faster atop BlueFS than atop XFS, when storing data on an SMR drive. In addition, LevelDB running on BlueFS is 2–20× faster than GearDB, a recent key-value store designed for SMR drives.

KEYWORDS: Shingled Magnetic Recording, SMR, HM-SMR, LSM-Tree, file systems

FULL TR: pdf