PARALLEL DATA LAB 

PDL Talk Series

August 12, 2021


TIME: 12:00 noon to approximately 1:00 pm EDT
PLACE: Virtual - a Zoom link will be emailed closer to the seminar

SPEAKER: Rashmi Vinayak, Assistant Professor, Computer Science Department, Carnegie Mellon University

Efficient, Performant, and Resilient Data Systems: Storage, Caching, and ML systems
Large-scale data systems, which facilitate storage, caching, computation, and communication of data, form the foundation of digital infrastructure. At large scales of operation, it is imperative to design data systems to be resource efficient, high performance, and resilient to non-ideal operating conditions. In this talk, I will present recent work from my research group (TheSys) towards these goals in systems for data storage, caching, and machine learning.

The primary emphasis of the talk will be on in-memory caching systems. I will first present some surprising findings from our fine-grained analysis of caching workloads at Twitter, which reveals that many workloads are far more write-heavy or more skewed than previously shown, that some display unique temporal patterns, and that TTL is a critical parameter defining cache working sets. I will then present Segcache, an in-memory KV cache designed to address these emerging workload characteristics, which achieves high memory efficiency, high throughput, and high scalability simultaneously. Evaluation on production traces shows that Segcache uses up to 60% less memory than state-of-the-art designs for a variety of workloads, without compromising on throughput or scalability. I will end with a brief overview of ongoing projects on achieving high efficiency, high performance, and resilience in storage systems and systems for machine learning.

BIO: Rashmi Vinayak is an assistant professor in the Computer Science department at Carnegie Mellon University. Her research interests broadly lie in computer/networked systems and information/coding theory, and the wide spectrum of intersection between the two areas. Rashmi is a recipient of the USENIX NSDI 2021 Community (Best Paper) Award, NSF CAREER Award, Tata Institute of Fundamental Research Memorial Lecture Award, Facebook Distributed Systems Research Award 2019, Google Faculty Research Award 2018, Facebook Communications and Networking Research Award 2017, UC Berkeley Eli Jury Dissertation Award 2016, and IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011/2012. During her Ph.D. studies, Rashmi was a recipient of the Facebook Fellowship 2012-13, the Microsoft Research PhD Fellowship 2013-15, and the Google Anita Borg Memorial Scholarship 2015-16. Rashmi received her Ph.D. from UC Berkeley in 2016, and was a postdoctoral scholar at UC Berkeley's AMPLab/RISELab from 2016 to 2017. Webpage: http://www.cs.cmu.edu/~rvinayak/


CONTACTS


Director, Parallel Data Lab
VOICE: (412) 268-1297


Executive Director, Parallel Data Lab
VOICE: (412) 268-5485


PDL Administrative Manager
VOICE: (412) 268-6716