22nd Annual/1st Virtual Spring Visit Day
June 23 - 24, 2020
Initial List of Visit Day Resources
PDL RESEARCH OVERVIEW TALKS
- Overview of Storage for HPC and ML Research -- Prof. George Amvrosiadis
- Overview of Caching Systems and Memory Services Research -- Prof. Nathan Beckmann
- Overview of Data Lake and Cloud Computing Research -- Prof. Greg Ganger
- Overview of Parallel and NVM-aware Computing Research -- Prof. Phil Gibbons
- Overview of Database Systems Research -- Prof. Andy Pavlo
- Overview of Coding for Systems Research -- Prof. Rashmi Vinayak
BOF Sessions
- ML Systems
- Scalable Storage and Data Lakes
- Storage Reliability and Caching Systems
- Exploiting New Storage Technologies
- Computer Architecture, Compilers, Parallelization
- Databases
Industry Talks
Big Memory through Virtual Cluster Memory
Marcos K. Aguilera, Senior Staff Researcher, VMware
Experience using RDMA in Oracle Database
Namrata Jampani, Principal Member of Technical Staff, Oracle
Workload-optimized SSDs
Pankaj Mehra, VP of Product Planning, Samsung Electronics
Adapting Technology Development for the IT4.0 Datasphere
Paul Kusbel, Senior Director, Seagate Research Group
Operationalising ML is Mostly a Systems Problem
Eno Thereska, Principal Engineer, Amazon
Improving Hypervisor Security
Carlo Bertolli, Research Staff Member and Manager, IBM
Zoned Namespaces: Take Control of Your Data
Matias Bjørling, Director, Emerging System Architectures, Western Digital
Metastable Failure States in Distributed Systems
Nathan Bronson, Software Engineer, Facebook
Advancements in Integrating Application Caches
Garret Swart, Oracle
Mind Your State for Your State of Mind
Pat Helland, Principal Architect, Salesforce
Initial List of Student Posters and Video Intros
File Systems Unfit as Distributed Storage Backends: Lessons from 10 Years of Ceph Evolution
Abutalib Aghayev, Sage Weil (Red Hat), Michael Kuchnik, Mark Nelson (Red Hat), Greg Ganger, George Amvrosiadis
Active Learning for ML Enhanced Database Systems
Lin Ma, Bailu Ding, Sudipto Das, Adith Swaminathan
HeART: A Case for Exploiting Disk-Reliability Heterogeneity
Saurabh Kadekodi, Rashmi Vinayak, Greg Ganger
Order-Preserving Key Compression for In-Memory Search Trees
Huanchen Zhang (CMU), Xiaoxuan Liu (CMU), David G. Andersen (CMU), Michael Kaminsky (BrdgAI), Kimberly Keeton (Hewlett Packard Labs), Andrew Pavlo (CMU)
Analysis of inter-job dependencies in Microsoft Cosmos
Andrew Chung (CMU), Carlo Curino (Microsoft), Subru Krishnan (Microsoft), Konstantinos Karanasos (Microsoft), Greg Ganger (CMU)
Wing: Unearthing Inter-job Dependencies for Better Cluster Scheduling
Andrew Chung (CMU), Carlo Curino (Microsoft), Subru Krishnan (Microsoft), Konstantinos Karanasos (Microsoft), Greg Ganger (CMU)
DeltaFS: Reforging FS For Monster-Scale Computing
Qing Zheng, Chuck Cranor, Greg Ganger, Garth Gibson, George Amvrosiadis, Brad Settlemyer (LANL), Gary Grider (LANL)
Low-Overhead, Error-Resilient ML Inference via Coded Computation
Jack Kosaian, Rashmi Vinayak
Accelerated Cloud for AI (ACAI)
Mengxin Cao, Zhiran Chen, Jin Han, Zijing Gu, Baljit Singh, Zhengzhe Yang, Sihan Yue, Eric Nyberg, Majd Sakr
End-to-End Compiler Optimizations for Microservice Based Applications
Pratik Fegade, Chris Fallin, Todd Mowry, Phil Gibbons, Christian Wimmer
The CacheLib Caching Engine: Experiences with Caching at Scale
Benjamin Berg, Daniel S. Berger, Sara McAllister, Isaac Grosof, Nathan Beckmann, Mor Harchol-Balter, Greg Ganger
Convertible Codes: A New Class of Codes for Efficient Conversion of Coded Data in Distributed Storage
Francisco Maturana, Chaitanya Mukka, Rashmi Vinayak
Hardware-Accelerated Control Flow Integrity
Dominic Chen, Wen Shih Lim, Phil Gibbons, Bryan Parno, James Hoe
Reconciling LSM-Trees with Modern Drives using BlueFS
Abutalib Aghayev, Matias Bjørling*, Hans Holmberg*, Marc Acosta*, Dan Helmick*, Sage Weil^, Greg Ganger, George Amvrosiadis; *Western Digital, Inc., ^Red Hat, Inc.
Mochi: Versatile Data Management Services for Future DOE Science
George Amvrosiadis, Greg Ganger, Chuck Cranor, Qing Zheng, Ankush Jain (CMU); Rob Ross, Phil Carns, Matthieu Dorier, Rob Latham, Shane Snyder (ANL); Galen Shipman, Brad Settlemyer, Gary Grider (LANL); Jerome Soumagne (HDF Group)
I/O Pattern Auto-detection for Better Cluster Job Scheduling
Chengze Fan, Minghua Deng, Mengyang Lyu, Michael Kuchnik, Chuck Cranor, Elisabeth Moore (LANL), Nathan DeBardeleben (LANL), George Amvrosiadis
The ATLAS Cluster Trace Repository
Michael Kuchnik, Zi Liang, Bryan Hooi, Chuck Cranor, Garth Gibson, George Amvrosiadis (CMU); Drew Gifford (SEI); Elisabeth Moore, Nathan DeBardeleben (LANL)
A Tensor Compiler for Recursive NLP Models
Pratik Fegade, Chris Fallin, Tianqi Chen, Todd Mowry, Phil Gibbons
Scalable Pointer Analysis for Data Structures Using Data Structures
Pratik Fegade, Christian Wimmer
Decode-enabled Storage for Edge Computing
Ziqiang (Edmond) Feng, Shilpa George, Roger Iyengar, Haithem Turki, Padmanabhan Pillai, Jan Harkes, Mahadev Satyanarayanan
PDL Computing Systems
Parallel Data Lab
Physical Makeup of the DCO
Parallel Data Lab
HOCI: Hardware/Software Co-Design and Processing-in-Memory for HTAP Databases
Amirali Boroumand (CMU), Saugata Ghose (CMU), Geraldo F. de Oliveira Jr. (ETH Zurich), Onur Mutlu (ETH Zurich/CMU)
Distributed Range Queries With Low Write Amplification
Ankush Jain, Qing Zheng, Chuck Cranor, Greg Ganger, George Amvrosiadis (CMU), Brad Settlemyer, Gary Grider (LANL)
DriftSurf: A Risk-competitive Learning Algorithm under Concept Drift
Ellango Jothimurugesan, Phil Gibbons (CMU); Ashraf Tahmasbi, Srikanta Tirthapura (Iowa State University)
Production Dataset Reliability Analyses
Saurabh Kadekodi, Francisco Maturana, Suhas JS, Juncheng Yang, Rashmi Vinayak, Greg Ganger
Pacemaker: Eliminating Transition Overload for Device-adaptive Redundancy
Saurabh Kadekodi, Francisco Maturana, Rashmi Vinayak, Greg Ganger, Juncheng Yang, Suhas Jayaram
Implementing Pacemaker in HDFS
Jiaan Dai, Jiaqi Zuo, Jiongtao Ye, Sai Kiriti Badam, Xuren Zhou, Saurabh Kadekodi, Rashmi Vinayak, Greg Ganger
High Availability in Cheap Distributed Key Value Storage
Thomas Kim, Daniel Wong, Michael Kaminsky, Greg Ganger, David G. Andersen
Increasing Accelerator Utilization of Specialized Convolutional Neural Network Inference via Folding
Jack Kosaian, Amar Phanishayee, Matthai Philipose, Debadeepta Dey, Rashmi Vinayak
Progressive Compressed Records: Taking a Byte out of Deep Learning Data
Michael Kuchnik, George Amvrosiadis, Virginia Smith
Massive Scaling of MASSIF: Algorithm Development for Hooke’s Law Simulations on Distributed GPU Systems
Anuva Kulkarni, Jelena Kovačević, Franz Franchetti
Fair Resource Allocation in Federated Learning
Tian Li, Maziar Sanjabi, Ahmad Beirami, Virginia Smith
Federated Optimization for Heterogeneous Networks
Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith
NoisePage: The Self-Driving Database Management System
Matt Butrovich, Wan Shen Lim, Andy Pavlo
Efficient Fault Tolerance for Recommendation System Training Using Erasure Codes
Kaige Liu, Jack Kosaian, Rashmi Vinayak
Kangaroo: Caching Small Objects in Flash
Sara McAllister, Daniel Berger, Nathan Beckmann, Greg Ganger
Block-Aware Caching
Charles McGuffey, Nathan Beckmann, Phillip Gibbons
Writeback-Aware Caching
Charles McGuffey, Bernhard Haeupler, Nathan Beckmann, Phillip Gibbons
Sage: Parallel Semi-Asymmetric Graph Algorithms for NVRAMs
Guy Blelloch, Laxman Dhulipala, Phillip Gibbons, Charles McGuffey, Julian Shun
Permutable Compiled Queries: Dynamically Adapting Compiled Queries without Recompiling
Prashanth Menon, Amadou Ngom, Lin Ma, Todd C. Mowry, Andrew Pavlo
More IOPS for Less: Exploiting Burstable Storage in Public Clouds
Hojin Park, George Amvrosiadis, Greg Ganger
Mimir: Navigating the Ocean of Cloud Storage Options
Hojin Park, George Amvrosiadis, Greg Ganger
Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning
Aurick Qiao, Willie Neiswanger, Qirong Ho, Hao Zhang, Gregory R. Ganger, Eric P. Xing
Jumanji: The Case for Dynamic NUCA in the Datacenter
Brian Schwedock, Nathan Beckmann
GenASM: A Low-Power, Memory-Efficient Approximate String Matching Acceleration Framework for Genome Sequence Analysis
Damla Senol Cali (CMU), Gurpreet S. Kalsi (Intel), Lavanya Subramanian (Intel), Can Firtina (ETH Zurich), Anant V. Nori (Intel), Jeremie S. Kim (ETH Zurich/CMU), Zulal Bingöl (Bilkent University), Rachata Ausavarungnirun (CMU), Mohammed Alser (ETH Zurich), Juan Gomez-Luna (ETH Zurich), Amirali Boroumand (CMU), Allison Scibisz (CMU), Can Alkan (Bilkent University), Sreenivas Subramoney (Intel), Saugata Ghose (CMU), Onur Mutlu (ETH Zurich/CMU)
Practical Processing Inside Emerging Memory Technologies
Minh Truong, Liting Shen, James Bain, Rick Carley, Saugata Ghose
ML-based Cache Admission Policies for Flash Storage
Daniel Wong, Daniel Berger, Nathan Beckmann, Greg Ganger
C2DN: How to Code on the Edge for Content Delivery
Juncheng Yang*, Anirudh Sabnis^, Daniel S. Berger*, K. V. Rashmi*, Ramesh Sitaraman^; *CMU, ^UMass
FilterKV: Fast Online Data Partitioning on Manycore Processors
Qing Zheng, Chuck Cranor, Ankush Jain, Greg Ganger, Garth Gibson, George Amvrosiadis, Brad Settlemyer (LANL), Gary Grider (LANL)
Scaling DeltaFS Indexed Massive Directories to 131,072 CPU Cores
Qing Zheng, Danhao Guo, Chuck Cranor, Greg Ganger, George Amvrosiadis, Garth Gibson, Brad Settlemyer (LANL), Gary Grider (LANL), Fan Guo (LANL)
Fast Trajectory Queries with DeltaFS Indexed Massive Directories
Qing Zheng, Chuck Cranor, George Amvrosiadis, Garth Gibson, Brad Settlemyer (LANL), Gary Grider (LANL), Fan Guo (LANL)