PARALLEL DATA LAB 

PDL Talk Series

August 7, 2025


TIME: 12:00 noon to approximately 1:00 pm EDT
PLACE: Virtual - a Zoom link will be emailed closer to the seminar


SPEAKER: Ezra Hoch, Jane Street

Depot: Multi-DC Storage for AI/ML Workloads
As the importance of AI/ML workloads grows, so does the number of GPUs. Power and availability constraints have led Jane Street to run GPU workloads in multiple data centers. Managing datasets across the estate becomes a challenge, especially in Jane Street's agile and dynamic environment.

I'll talk about Depot, a storage metadata layer that we're building to address Jane Street's use cases: the issues we've seen with using NFS's directory structure as a metadata layer, the API trade-offs we've considered, the API we landed on (a middle ground between S3 and a filesystem), and Depot's architecture.

BIO: Ezra Hoch is a software engineer at Jane Street, working on distributed storage systems. His previous roles at Google include TL-ing GCP’s file solutions, developing GCP’s networking stack, and leading Effingo, Google’s global replication system. Prior to Google, he was chief architect at Elastifile, a startup developing a scale-out SSD-optimized filesystem (acquired by Google). His focus is on large-scale distributed infrastructure systems. He received his BSc in Computer Science, and his MSc and PhD in distributed algorithms, from the Hebrew University of Jerusalem.


CONTACTS


Director, Parallel Data Lab
VOICE: (412) 268-1297


Executive Director, Parallel Data Lab
VOICE: (412) 268-5485


PDL Administrative Manager
VOICE: (412) 268-6716