PARALLEL DATA LAB

TetriScope

TetriScope combines the application scheduler TetriSched with the visualization tool Atlas, both integrated into Yarn, Hadoop's resource manager.

TetriSched is a high-performance plan-ahead scheduler created by PDL researchers. It has been integrated into Yarn as a pluggable scheduler. The following diagram gives a sketch of TetriSched.


[pdf version]

Atlas is a web-based user interface developed by PDL. It has been integrated into Yarn's web site to provide a graphical view of the scheduled applications. Although it was initially designed to help researchers refine the TetriSched scheduler, Atlas is a scheduler-independent UI, capable of working with any Yarn scheduler.

The following shows the TetriScope system diagram, with screenshots and their interpretation.


[pdf version]

In addition to a detailed view of how an application is scheduled across the nodes in a data center, Atlas provides a high-level view of aggregated load on a per-rack basis for large data centers. It also provides an interactive timeline that the user can zoom and swipe to adjust the time window in which applications are viewed.
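
The per-rack view can be thought of as a roll-up of per-node load. The sketch below is only an illustration of that idea, not Atlas's implementation: the function name, the two dictionaries, and the choice of averaging node utilization within a rack are all assumptions made for this example.

```python
from collections import defaultdict


def aggregate_load_by_rack(node_loads, node_to_rack):
    """Average per-node load within each rack.

    node_loads:   dict mapping node name -> utilization in [0.0, 1.0]
    node_to_rack: dict mapping node name -> rack name

    Both structures are hypothetical; they are not Atlas's data model.
    """
    totals = defaultdict(float)  # summed utilization per rack
    counts = defaultdict(int)    # number of nodes seen per rack
    for node, load in node_loads.items():
        rack = node_to_rack[node]
        totals[rack] += load
        counts[rack] += 1
    # One aggregated value per rack, suitable for a high-level display.
    return {rack: totals[rack] / counts[rack] for rack in totals}


# Example with made-up nodes and racks.
loads = {"node1": 0.5, "node2": 1.0, "node3": 0.25}
racks = {"node1": "rackA", "node2": "rackA", "node3": "rackB"}
print(aggregate_load_by_rack(loads, racks))
```

Averaging is one possible aggregate; a sum or a maximum per rack would fit the same structure if a different summary were wanted.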

The following screencasts of TetriScope show three scheduling scenarios. They also showcase certain features of Atlas, such as viewing aggregated load and manipulating the timeline.

 

Plan-ahead scheduling


For best results, use full screen to play. Click to download a full resolution mp4 version of this video.

Global scheduling


For best results, use full screen to play. Click to download a full resolution mp4 version of this video.

Pre-emption scheduling


For best results, use full screen to play. Click to download a full resolution mp4 version of this video.

People

FACULTY

Greg Ganger
Mor Harchol-Balter

STAFF

Bill Courtright
Xiaolin (Charlene) Zang

GRAD STUDENTS

Jun Woo Park
Alexey Tumanov
Timothy Zhu

INDUSTRY COLLABORATORS

Michael A. Kozuch (Intel Labs)

Publications

  • TetriSched: Global Rescheduling with Adaptive Plan-ahead in Dynamic Heterogeneous Clusters. Alexey Tumanov, Timothy Zhu, Jun Woo Park, Michael A. Kozuch, Mor Harchol-Balter, Gregory R. Ganger. ACM European Conference on Computer Systems, 2016 (EuroSys'16), 18th-21st April, 2016, London, UK.
    Abstract / PDF [8M]

  • TetriSched: Space-Time Scheduling for Heterogeneous Datacenters. Alexey Tumanov, Timothy Zhu, Michael A. Kozuch, Mor Harchol-Balter, Gregory R. Ganger. Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-13-112, December, 2013.
    Abstract / PDF [716K]

Acknowledgements

We thank the members and companies of the PDL Consortium: Amazon, Google, Hitachi Ltd., Honda, Intel Corporation, IBM, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.