PARALLEL DATA LAB 

PDL Abstract

External vs. Internal: An Essay on Machine Learning Agents for Autonomous Database Management Systems

Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 42(2): 32-46 (2019).

Andrew Pavlo, Matthew Butrovich, Ananya Joshi, Lin Ma, Prashanth Menon, Dana Van Aken, Lisa Lee, Ruslan Salakhutdinov

Carnegie Mellon University

http://www.pdl.cmu.edu/

The vast number of ways to configure database management systems (DBMSs) has rightfully earned them the reputation of being difficult to manage and tune. Optimizing a DBMS to meet the needs of an application has surpassed the abilities of humans, because the correct configuration depends on more interacting factors than a person can reason about. The problem is further exacerbated in large-scale deployments with thousands or even millions of individual DBMS installations that each have their own tuning requirements.

To overcome this problem, recent research has explored using machine learning (ML)-based agents for automated tuning of DBMSs. These agents extract performance metrics and behavioral information from the DBMS, then train models on this data to select the tuning actions that they predict will have the most benefit. They then observe how these actions affect the DBMS and update their models to further improve their efficacy.

In this paper, we discuss two engineering approaches for integrating ML agents into a DBMS. The first is to build an external tuning controller that treats the DBMS as a black box. The second is to integrate the ML agents natively into the DBMS’s architecture. We consider the trade-offs of these approaches in the context of two projects from Carnegie Mellon University (CMU).
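
As a rough illustration of the observe, select, apply, and update loop described above, the following Python sketch tunes a single knob on a simulated DBMS. It is not the method used in either CMU project: the SimulatedDBMS class, its buffer_pool_mb knob, the latency curve, and the epsilon-greedy TuningAgent are all hypothetical stand-ins chosen to keep the example small. The agent reaches the DBMS only through apply() and collect_metrics(), which loosely mirrors the external, black-box style of integration.

```python
import random


class SimulatedDBMS:
    """Hypothetical stand-in for a real DBMS: one knob, one metric."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.buffer_pool_mb = 128

    def apply(self, buffer_pool_mb):
        # Tuning action: change the (made-up) buffer pool size knob.
        self.buffer_pool_mb = buffer_pool_mb

    def collect_metrics(self):
        # Invented latency curve with its minimum at 1024 MB, plus small noise.
        base = 5.0 + abs(self.buffer_pool_mb - 1024) / 256.0
        return {"latency_ms": base + self._rng.uniform(-0.2, 0.2)}


class TuningAgent:
    """Epsilon-greedy agent that keeps a running mean latency per setting."""

    def __init__(self, candidates, epsilon=0.2, seed=0):
        self.candidates = list(candidates)
        self.epsilon = epsilon
        self._rng = random.Random(seed)
        self.mean = {c: 0.0 for c in self.candidates}
        self.count = {c: 0 for c in self.candidates}

    def select_action(self):
        # Try every setting once, then mostly exploit the best one seen so far.
        untried = [c for c in self.candidates if self.count[c] == 0]
        if untried:
            return untried[0]
        if self._rng.random() < self.epsilon:
            return self._rng.choice(self.candidates)
        return min(self.candidates, key=lambda c: self.mean[c])

    def update(self, action, latency_ms):
        # Incremental mean: the "model" here is just one average per setting.
        self.count[action] += 1
        self.mean[action] += (latency_ms - self.mean[action]) / self.count[action]

    def best(self):
        tried = [c for c in self.candidates if self.count[c] > 0]
        return min(tried, key=lambda c: self.mean[c])


def tune(dbms, agent, rounds=50):
    """Run the select -> apply -> observe -> update loop for a fixed budget."""
    for _ in range(rounds):
        action = agent.select_action()
        dbms.apply(action)
        metrics = dbms.collect_metrics()
        agent.update(action, metrics["latency_ms"])
    return agent.best()


if __name__ == "__main__":
    agent = TuningAgent([128, 256, 512, 1024, 2048, 4096])
    print(tune(SimulatedDBMS(), agent))
```

A real agent would replace the per-setting average with a learned model over many knobs and metrics. An internal agent would presumably also have access to state inside the DBMS that this narrow interface does not expose.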
