PARALLEL DATA LAB 

PDL Abstract

Diagnosing Performance Problems by Visualizing and Comparing System Behaviours

Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-10-103. February 2010.

Raja R. Sambasivan, Alice X. Zheng†, Elie Krevat, Spencer Whitman, Gregory R. Ganger

Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

†Microsoft Research

http://www.pdl.cmu.edu/

Spectroscope is a new toolset that helps developers with the long-standing challenge of performance debugging in distributed systems. It mines end-to-end traces of request processing within and across components. Using Spectroscope, developers can visualize and compare system behaviours between two periods or system versions, identifying and ranking changes in the flow or timing of request processing. This report presents examples of how Spectroscope has been used to diagnose real performance problems in a distributed storage system and evaluates Spectroscope’s primary assumptions and algorithms.
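As a rough illustration of comparing request timing between two periods, the sketch below groups response times by request category, applies a two-sample hypothesis test to each category, and ranks the categories that changed. It is not Spectroscope's implementation: the abstract does not name the test or the ranking metric, so the Kolmogorov-Smirnov test, the ranking by mean shift times request count, and all names (`ks_statistic`, `differs`, `rank_timing_changes`) are assumptions made for this example. Changes in request flow (structural mutations) are not covered.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def differs(a, b, alpha=0.05):
    """Reject 'same distribution' when the KS statistic exceeds the
    standard large-sample critical value for significance level alpha."""
    n, m = len(a), len(b)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) > critical


def rank_timing_changes(before, after, alpha=0.05):
    """before/after map a request category to its response times.
    Returns (category, impact) pairs for categories whose timing changed,
    largest impact first. Impact is an assumed metric: the shift in mean
    response time multiplied by the number of requests in the later period."""
    changes = []
    for category in before.keys() & after.keys():
        old, new = before[category], after[category]
        if differs(old, new, alpha):
            mean_shift = sum(new) / len(new) - sum(old) / len(old)
            changes.append((category, mean_shift * len(new)))
    return sorted(changes, key=lambda c: abs(c[1]), reverse=True)
```

Calling `rank_timing_changes(before, after)` with two dictionaries of per-category response times returns only the categories whose timing distributions differ at the chosen significance level, so unchanged categories drop out of the ranking.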

KEYWORDS: comparing system behaviours, end-to-end tracing, hypothesis testing, performance debugging, performance problem diagnosis, response-time mutations, request-flow graphs, structural mutations, structural performance problems, visualizing system behaviour
