Monday, July 11, 2011 - NOTE SPECIAL DAY
12:00 pm - 1:00 pm
PLACE: CIC 2101
Rodrigo Fonseca, Brown University
TITLE: Experiences with Causal Tracing using X-Trace
Diagnosing problems in distributed systems is notoriously hard, and getting more so as the scale of the systems we build increases exponentially. Task-centric tracing, unlike the more common device-centric monitoring, enables us to causally trace the complete execution of a distributed system across the boundaries of applications, components, protocols, and administrative domains. In this talk, I argue that causal, end-to-end tracing should be an integral part of distributed systems. Moreover, it is not fundamentally difficult to achieve, given a primitive that propagates task metadata alongside logical execution and communication paths. X-Trace is a framework that relies on such propagation to provide comprehensive causal tracing. We report on our experience integrating X-Trace into several production networked services -- including 802.1X authentication, Web content distribution, and DNS-based replica selection -- to illustrate the benefits of causal tracing, and to discuss the instrumentation of different protocols and component architectures. We highlight the challenges we encountered and the techniques we developed to better integrate causal tracing into systems, and discuss how a standardized tracing framework could improve the way we approach the problem moving forward.
Rodrigo Fonseca is an assistant professor in Brown University's Computer Science Department, where he does research in distributed systems, networking, and operating systems, with a focus on understanding and improving their execution in terms of performance and energy efficiency. He obtained his PhD from UC Berkeley in 2008, and worked as a post-doctoral researcher at Yahoo! Research prior to joining Brown.
SDI / LCS Seminar Questions?
Karen Lindenfelser, 86716, or visit www.pdl.cmu.edu/SDI/