PARALLEL DATA LAB 

PDL Abstract

COpter: Efficient Large-Scale Resource-Allocation via Continual Optimization

SOSP ’19, October 27–30, 2019, Huntsville, ON, Canada.

Suhas Jayaram Subramanya*, Don Kurian Dennis^, Gregory R. Ganger*, Virginia Smith*

* Carnegie Mellon University
^ Meta

http://www.pdl.cmu.edu/

Optimization-based resource allocation in large-scale systems often must trade-off responsiveness and allocation quality. Generally, allocations are reconsidered every few minutes (a round) by formulating and solving a new optimization problem. This paper introduces continual optimization, which reframes round-based resource allocation as a sequence of interconnected problems, leveraging the observation that these resource allocation problems often only change by small amounts across successive rounds to reduce solving times. COpter provides a method for continual optimization of Linear Programs (LP) and Mixed Integer Linear Programs (MILP) formulations of resource allocation problems by combining three innovations: (1) an efficient-to-update problem representation for incremental changes, (2) a proximal-point method implementation that can provably benefit from prior computational effort and allocations, and (3) lightweight heuristics for mixed-integer problems that recover feasible integer solutions with negligible quality loss. We evaluate COpter on problems in three domains: GPU cluster sched- uling, shard load balancing, and WAN traffic engineering. Overall, we find that COpter finds high-quality solutions while reducing solver runtimes by 57–83× compared to state-of-the-art commercial solvers. Compared to problem partitioning approaches (POP), COpter simultaneously improves allocation quality and reduces end-to-end allocator runtimes by 1.5–30×.

KEYWORDS: resource allocation, optimization, scheduling, lin- ear programming, mixed-integer linear programming

FULL PAPER: pdf