PDL Abstract

The CacheLib Caching Engine: Design and Experiences at Scale

14th USENIX Symposium on Operating Systems Design and Implementation (OSDI'20), Virtual Event, Nov. 4–6, 2020.

Benjamin Berg¹, Daniel S. Berger¹˒³, Sara McAllister¹, Isaac Grosof¹, Sathya Gunasekar², Jimmy Lu², Michael Uhlar², Jim Carrig², Nathan Beckmann¹, Mor Harchol-Balter¹, Gregory R. Ganger¹

¹ Carnegie Mellon University
² Facebook
³ Microsoft Research


Web services rely on caching at nearly every layer of the system architecture. Commonly, each cache is implemented and maintained independently by a distinct team and is highly specialized to its function. For example, an application-data cache would be independent from a CDN cache. However, this approach ignores the difficult challenges that different caching systems have in common, greatly increasing the overall effort required to deploy, maintain, and scale each cache.

This paper presents a different approach to cache development, successfully employed at Facebook, which extracts a core set of common requirements and functionality from otherwise disjoint caching systems. CacheLib is a general-purpose caching engine, designed based on experiences with a range of caching use cases at Facebook, that facilitates the easy development and maintenance of caches. CacheLib was first deployed at Facebook in 2017 and today powers over 70 services including CDN, storage, and application-data caches.

This paper describes our experiences during the transition from independent, specialized caches to the widespread adoption of CacheLib. We explain how the characteristics of production workloads and use cases at Facebook drove important design decisions. We describe how caches at Facebook have evolved over time, including the significant benefits seen from deploying CacheLib. We also discuss the implications our experiences have for future caching design and research.
