PARALLEL DATA LAB 

PDL Abstract

SALSA: Analyzing Logs as StAte Machines

Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-08-112, September 2008.

Jiaqi Tan, Xinghao Pan, Soila Kavulya, Rajeev Gandhi and Priya Narasimhan

Parallel Data Laboratory
School of Computer Science & Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

SALSA examines system logs to derive state-machine views of the system’s execution, along with control-flow and data-flow models and related statistics. Exploiting SALSA’s derived views and statistics, we can effectively construct useful higher-level analyses. We demonstrate SALSA’s approach by analyzing system logs generated in a Hadoop cluster, and then illustrate SALSA’s value by developing visualization and failure-diagnosis techniques, for three different Hadoop workloads, based on our derived state-machine views and statistics.
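
To make the idea of a state-machine view concrete, the following is a minimal sketch, not SALSA's implementation. It assumes a simplified, hypothetical log format (timestamp, task id, state name, START or END marker) that is not Hadoop's actual log format, and it derives only one kind of statistic: the duration of each state. The function names and the sample log are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical log format (an assumption, not Hadoop's real format):
#   <seconds> <task_id> <state> <START|END>
SAMPLE_LOG = """\
0 task_1 Map START
1 task_2 Map START
4 task_1 Map END
7 task_2 Map END
7 task_3 Reduce START
12 task_3 Reduce END
"""


def derive_state_durations(log_text):
    """Pair START/END events per (task, state) and collect state durations."""
    open_states = {}               # (task, state) -> start timestamp
    durations = defaultdict(list)  # state -> list of observed durations
    for line in log_text.splitlines():
        if not line.strip():
            continue
        timestamp, task, state, marker = line.split()
        key = (task, state)
        if marker == "START":
            open_states[key] = float(timestamp)
        elif marker == "END" and key in open_states:
            durations[state].append(float(timestamp) - open_states.pop(key))
    return dict(durations)


def mean_durations(durations):
    """Summarize each state by its mean duration."""
    return {state: sum(values) / len(values) for state, values in durations.items()}


if __name__ == "__main__":
    per_state = derive_state_durations(SAMPLE_LOG)
    print(per_state)
    print(mean_durations(per_state))
```

Per-state durations of this kind are one example of a statistic that a higher-level analysis could build on, for instance by comparing a node's durations against those of its peers.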

KEYWORDS: Log Analysis, Hadoop, Failure Diagnosis

FULL TR: pdf