Recent PDL Publications


The PDL Packet - Fall 2024 Newsletter

3 Papers at SOSP!

LithOS: An Operating System for Efficient Machine Learning on GPUs

Patrick H. Coppock, Brian Zhang, Eliot H. Solomon, Vasilis Kypriotis, Leon Yang, Bikash Sharma, Dan Schatzberg, Todd C. Mowry, Dimitrios Skarlatos

SOSP ’25, October 13–16, 2025, Seoul, Republic of Korea.

The surging demand for GPUs in datacenters for machine learning (ML) workloads has made efficient GPU utilization crucial. However, meeting the diverse needs of individual ML models while optimizing resource usage is challenging. To enable transparent, fine-grained management of GPU resources that maximizes GPU utilization and energy efficiency while maintaining strong isolation, an operating systems (OS) approach is needed. Hence this paper introduces LithOS, a first step towards a GPU OS.

COpter: Efficient Large-Scale Resource-Allocation via Continual Optimization

Suhas Jayaram Subramanya, Don Kurian Dennis, Gregory R. Ganger, Virginia Smith

SOSP ’25, October 13–16, 2025, Seoul, Republic of Korea.

Optimization-based resource allocation in large-scale systems often must trade off responsiveness and allocation quality. Generally, allocations are reconsidered every few minutes (a round) by formulating and solving a new optimization problem. This paper introduces continual optimization, which reframes round-based resource allocation as a sequence of interconnected problems. It reduces solving times by leveraging the observation that these resource allocation problems often change only by small amounts across successive rounds.

Moirai: Optimizing Placement of Data and Compute in Hybrid Clouds

Ziyue Qiu, Hojin Park, Jing Zhao, Yukai Wang, Arnav Balyan, Gurmeet Singh, Yangjun Zhang, Suqiang (Jack) Song, Gregory R. Ganger, George Amvrosiadis

SOSP ’25, October 13–16, 2025, Seoul, Republic of Korea.

The deployment of large-scale data analytics between on-premise and cloud sites, i.e., hybrid clouds, requires careful partitioning of both data and computation to avoid massive networking costs. We present Moirai, a cost-optimization framework that analyzes job accesses and data dependencies and optimizes the placement of both in hybrid clouds. Moirai informs the job scheduler of data location and access predictions ...


Recent PDL News

PDL Alum Jure Leskovec Receives CMU 2025 Alumni Achievement Award

Jure Leskovec’s pioneering work in data science, machine learning and network science has shaped the way complex systems are studied and applied across academia, industry and society, proving what can be accomplished when talented, driven people put their hearts in the work. The awards are presented to alumni for exceptional accomplishment and leadership in their fields or vocations ...


Best Research Paper Runner-up at VLDB'25

Congratulations to Sam Arch and co-authors Yuchen Liu, Todd Mowry, Jignesh Patel, and Andrew Pavlo on receiving the Best Research Paper Runner-Up award at the 51st International Conference on Very Large Data Bases, held in London, UK ...


Akshitha Sriraman Wins Google ML & Systems Award

Congratulations to Akshitha as she receives an inaugural 2025 Google ML and Systems Junior Faculty Award in recognition of the significance and promise of her work in HW/SW Co-Design and Acceleration ...
