Thursday, March 8, 2012
SPEAKER: Raja Sambasivan, CMU
TITLE: Diagnosing Performance Changes by Comparing Request Flows
In this talk, I describe request-flow comparison, a new technique for automatically localizing the sources of performance changes in distributed systems and a logical first step toward the larger goal of completely automated diagnosis and healing. It rests on the key insight that such changes often manifest as mutations in the path requests take through the distributed system (e.g., the components they visit and the functions they access) or in their timing. Exposing these mutations and showing how they differ from previous behaviour localizes the source of the problem and significantly guides developer effort. I present case studies of using request-flow comparison to diagnose real, previously undiagnosed problems in a prototype distributed storage service and in certain Google services. Finally, I conclude with a roadmap highlighting future research opportunities on the path to automated diagnosis.
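
As a rough illustration of the idea in the abstract, the following minimal Python sketch compares request flows from two periods and reports the two kinds of mutation mentioned: paths that did not occur before (structural) and paths whose timing changed. It is not the speaker's implementation. The trace format (a list of component names plus a latency), the function names, the component names, and the simple mean-ratio threshold are all assumptions made for this example; a real system would likely use a proper statistical test rather than a fixed ratio.

```python
from collections import defaultdict
from statistics import mean


def group_by_path(requests):
    """Map each distinct path (tuple of component names) to its latencies."""
    groups = defaultdict(list)
    for path, latency_ms in requests:
        groups[tuple(path)].append(latency_ms)
    return groups


def compare_flows(before, after, slowdown_ratio=1.5):
    """Return (structural, timing) mutations between two periods.

    structural: paths seen in `after` that never occurred in `before`.
    timing: paths seen in both periods whose mean latency grew by more
            than `slowdown_ratio` (a hypothetical threshold, chosen here
            only for illustration).
    """
    old, new = group_by_path(before), group_by_path(after)
    structural = sorted(p for p in new if p not in old)
    timing = sorted(
        p for p in new
        if p in old and mean(new[p]) > slowdown_ratio * mean(old[p])
    )
    return structural, timing


# Hypothetical traces: (components visited, latency in milliseconds).
before = [
    (["frontend", "cache"], 2.0),
    (["frontend", "cache"], 2.2),
    (["frontend", "metadata", "storage"], 10.0),
    (["frontend", "metadata", "storage"], 11.0),
]
after = [
    (["frontend", "cache", "storage"], 12.0),
    (["frontend", "metadata", "storage"], 30.0),
    (["frontend", "metadata", "storage"], 33.0),
]

structural, timing = compare_flows(before, after)
print("structural mutations:", structural)
print("response-time mutations:", timing)
```

In this made-up data, requests that used to be served from the cache now also visit storage (a structural mutation), while the metadata path keeps its shape but takes about three times as long (a timing mutation). Pointing a developer at those two differences is the kind of localization the abstract describes.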