Thursday, April 24, 2008
12:00 noon - 1:00 pm
Intel Research Pittsburgh, 4720 Forbes Avenue, CIC Building 4th Floor, Suite 410
NOTE SPECIAL LOCATION
Matei Zaharia and Andy Konwinski
Monitoring and Debugging Hadoop using X-Trace
Today's Internet data center applications manage thousands of commodity machines and deal with node heterogeneity, load balancing, and node failures. This complexity makes data center applications difficult to debug, optimize and monitor. In this talk, we discuss our work using path-based tracing to understand massively parallel applications written using the Hadoop platform. We have instrumented Hadoop using the X-Trace path-based tracing framework. We present case studies showing the utility of tracing in several situations and describe interesting Hadoop behavior we have observed. We also show how statistical techniques can be used to automatically identify faulty nodes and unusual software behavior from traces. Our instrumentation provides useful information with very little impact on performance.
Matei Zaharia is a first-year PhD student at UC Berkeley, working with professors Randy Katz and Ion Stoica on tracing and automatic monitoring of large data center applications. He is a member of the RAD Lab (Reliable, Adaptive Distributed systems). He completed his undergraduate degree at the University of Waterloo, where he worked with professor Srinivasan Keshav on peer-to-peer networks and technology for emerging regions.
Andy Konwinski is also a first-year PhD student in the RAD Lab at UC Berkeley, advised by Randy Katz. Andy completed his undergraduate degree at the University of Wisconsin. His current research interests include distributed tracing and monitoring frameworks, distributed file systems, and distributed computing frameworks such as MapReduce.
Host: Steven W. Schlosser
Visitor Coordinator: Shellee Lank, email@example.com
or visit http://www.pdl.cmu.edu/SDI/