PDL Abstract

Lazy Redundancy for NVM Storage: Handing the Performance-Reliability Tradeoff to Applications

Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-101, Apr 2019.

Rajat Kateja, Andy Pavlo, Gregory R. Ganger

Carnegie Mellon University

Lazy redundancy maintenance can provide direct-access non-volatile memory (NVM) with low-overhead data integrity features. The ANON library lazily maintains redundancy (per-page checksums and cross-page parity) for applications that exploit fine-grained direct load/store access to NVM data. To do so, ANON repurposes page table dirty bits to identify pages whose redundancy must be updated, and it addresses the consistency challenges of relying on dirty bits across crashes. A periodic background thread updates outdated redundancy at a dataset-specific frequency, chosen to tune the tradeoff between performance and time-to-coverage. This approach avoids interposing on the critical path and often amortizes redundancy updates across many stores to a page, enabling ANON to maintain redundancy at just a few percent overhead. For example, MongoDB’s YCSB throughput drops by less than 2% when using ANON with a 30-second period and by only 3–7% with a 1-second period. Compared to the state-of-the-art approach, ANON with a 30-second period increases throughput by up to 1.8x for Redis with YCSB workloads and by up to 4.2x for write-only microbenchmarks.
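The lazy-update idea described above can be illustrated with a small user-space simulation. This is a sketch under stated assumptions, not ANON's implementation: the class name LazyRedundancyStore, the stripe width of 4, the use of CRC32 as the checksum, and the Python set standing in for page table dirty bits are all hypothetical choices made for illustration. The sketch also omits crash consistency, which the abstract identifies as a challenge ANON addresses.

```python
import zlib

PAGE_SIZE = 4096
STRIPE_WIDTH = 4  # data pages per parity page (assumed for illustration)


class LazyRedundancyStore:
    """Toy model: stores mark pages dirty; redundancy is updated later."""

    def __init__(self, num_pages):
        assert num_pages % STRIPE_WIDTH == 0
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.dirty = set()  # stand-in for hardware page table dirty bits
        self.checksums = [zlib.crc32(p) for p in self.pages]
        # XOR of all-zero pages is zero, so initial parity is consistent.
        self.parity = [bytearray(PAGE_SIZE)
                       for _ in range(num_pages // STRIPE_WIDTH)]

    def store(self, page_no, offset, data):
        """Critical path: write the data and mark the page dirty, nothing else."""
        assert offset + len(data) <= PAGE_SIZE
        self.pages[page_no][offset:offset + len(data)] = data
        self.dirty.add(page_no)

    def _xor_stripe(self, stripe, skip=None):
        """XOR together the data pages of one stripe, optionally skipping one."""
        acc = 0
        start = stripe * STRIPE_WIDTH
        for p in range(start, start + STRIPE_WIDTH):
            if p != skip:
                acc ^= int.from_bytes(self.pages[p], "little")
        return acc

    def refresh(self):
        """One pass of the periodic background work; returns pages updated."""
        dirty, self.dirty = self.dirty, set()
        for stripe in {p // STRIPE_WIDTH for p in dirty}:
            parity = self._xor_stripe(stripe)
            self.parity[stripe] = bytearray(parity.to_bytes(PAGE_SIZE, "little"))
        for p in dirty:
            self.checksums[p] = zlib.crc32(self.pages[p])
        return len(dirty)

    def verify(self, page_no):
        """Check a page against its stored checksum."""
        return zlib.crc32(self.pages[page_no]) == self.checksums[page_no]

    def recover(self, page_no):
        """Rebuild a page from parity and the other pages in its stripe."""
        stripe = page_no // STRIPE_WIDTH
        others = self._xor_stripe(stripe, skip=page_no)
        parity = int.from_bytes(self.parity[stripe], "little")
        self.pages[page_no] = bytearray(
            (others ^ parity).to_bytes(PAGE_SIZE, "little"))


store = LazyRedundancyStore(num_pages=8)
for i in range(100):
    store.store(2, i, b"x")      # 100 stores to the same page
store.store(5, 0, b"hello")
updated = store.refresh()         # one redundancy update per dirty page
```

In this toy model, the 100 stores to page 2 cost a single checksum and parity update at the next refresh, which is the amortization the abstract refers to. The interval between a store and the following refresh is the window in which that page's redundancy is stale, so a shorter period shrinks the window at the price of more frequent background work.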

KEYWORDS: NVM, DAX, asynchronous redundancy

FULL TR: pdf