Adam Oliner, Stanford University

TITLE: Using Influence to Understand Complex Systems

When a complex production system fails or just behaves badly, the debugging technique of first resort is to examine system logs or other passively gathered signals that might hold important clues about the nature of the problem. By their very nature, these data are normally noisy and incomplete. We define a statistical notion of "influence" between components that deals gracefully with this situation. Intuitively, two components influence each other if their behavior is statistically correlated, and we can often show correlation even if the available data are insufficient to show why the components might affect one another. We show how to compute influence and how to present it in a useful form as a Structure-of-Influence Graph among components, and we give example applications of these ideas to a variety of production systems, including autonomous vehicles and supercomputers.

Adam Oliner is a PhD student in the Computer Science Department at Stanford University, working with Alex Aiken. Before coming to Stanford, he earned a Master of Engineering in electrical engineering and computer science at MIT, where he also received undergraduate degrees in computer science and mathematics. He interned several times at IBM with the Blue Gene/L system software team and spent a summer studying supercomputer logs at Sandia National Labs.

