Easing the Management of Data-parallel Systems via Adaptation

Appears in the Proceedings of the 9th ACM SIGOPS European Workshop, Kolding, Denmark, September 17-20, 2000.

David Petrou, Khalil Amiri*, Gregory R. Ganger* and Garth A. Gibson

School of Computer Science
Dept. of Electrical and Computer Engineering*
Carnegie Mellon University
Pittsburgh, PA 15213


Data-mining workloads have grown enormously in size and prevalence in recent years. We argue that high availability and fast turnaround for these workloads can be achieved only by dynamically tuning a number of system parameters, and that the system should perform this tuning automatically. We contribute a framework that is expressive enough to capture a variety of data-parallel applications, yet restricted enough that the system can tune itself. This framework is part of the Abacus migration system, whose function placement algorithms are extended to reason about how many nodes should participate in a data-parallel computation, how to split up application objects among a client and server cluster, how often program state should be checkpointed, and how these sometimes conflicting decisions interact.
