Kangaroo: Caching Billions of Tiny Objects on Flash
Proceedings of the 28th ACM Symposium on Operating Systems Principles (SOSP '21) October 25-28, 2021. Virtual Event.
Sara McAllister*, Benjamin Berg*, Julian Tutuncu-Macias*, Juncheng Yang*, Sathya Gunasekar†, Jimmy Lu†, Daniel Berger‡, Nathan Beckmann*, Gregory R. Ganger*
* Carnegie Mellon University
‡ University of Washington and Microsoft Research
Many social-media and IoT services have very large working sets consisting of billions of tiny (~100 B) objects. Large, flash-based caches are important to serving these working sets at acceptable monetary cost. However, caching tiny objects on flash is challenging for two reasons: (i) SSDs can read/write data only in multi-KB pages that are much larger than a single object, stressing the limited number of times flash can be written; and (ii) very few bits per cached object can be kept in DRAM without losing flash's cost advantage. Unfortunately, existing flash-cache designs fall short of addressing these challenges: write-optimized designs require too much DRAM, and DRAM-optimized designs write flash too much.
We present Kangaroo, a new flash-cache design that optimizes both DRAM usage and flash writes to maximize cache performance while minimizing cost. Kangaroo combines a large, set-associative cache with a small, log-structured cache. The set-associative cache requires minimal DRAM, while the log-structured cache minimizes Kangaroo's flash writes. Experiments using traces from Facebook and Twitter show that Kangaroo achieves DRAM usage close to the best prior DRAM-optimized design, flash writes close to the best prior write-optimized design, and miss ratios better than both. Kangaroo's design is Pareto-optimal across a range of allowed write rates, DRAM sizes, and flash sizes, reducing misses by 29% over the state of the art. These results are corroborated with a test deployment of Kangaroo in a production flash cache at Facebook.
FULL PAPER: pdf - coming soon