In the last two decades, both researchers and vendors have built advisory tools to assist database administrators in various aspects of system tuning and physical design. Most of this previous work, however, is incomplete because it still requires humans to make the final decisions about any changes to the database, and because it is reactive: it fixes problems only after they occur.
What is needed for a truly “self-driving” database management system (DBMS) is a new architecture that is designed for autonomous operation. This differs from earlier attempts because all aspects of the system are controlled by an integrated planning component that not only optimizes the system for the current workload, but also predicts future workload trends so that the system can prepare itself accordingly. With this, the DBMS can support all of the previous tuning techniques without requiring a human to determine the right way and proper time to deploy them. It also enables new optimizations that are important for modern high-performance DBMSs, but which are not possible today because the complexity of managing these systems has surpassed the abilities of human experts.
Peloton is a relational database management system designed for fully autonomous optimization of hybrid workloads. It is built by students and researchers at the Carnegie Mellon Database Research Group. See the people page for the full listing of contributors.
- Postgres wire-protocol and JDBC compatible.
- Native support for byte-addressable non-volatile memory (NVM) storage technology.
- Lock-free multi-version concurrency control.
- Integrated artificial intelligence components that enable autonomous optimizations.
- High-performance, lock-free Bw-Tree for indexing.
- 100% Open-Source (Apache Software License v2.0).
Dana Van Aken
- Self-Driving Database Management Systems. A. Pavlo, G. Angulo, J. Arulraj, H. Lin, J. Lin, L. Ma, P. Menon, T. Mowry, M. Perron, I. Quah, S. Santurkar, A. Tomasic, S. Toor, D. Van Aken, Z. Wang, Y. Wu, R. Xian, and T. Zhang. In CIDR 2017, Conference on Innovative Data Systems Research. January 8-11, 2017, Chaminade, CA.
- An Empirical Evaluation of In-Memory Multi-Version Concurrency Control. Yingjun Wu, Joy Arulraj, Jiexi Lin, Ran Xian, Andrew Pavlo. Proceedings of the VLDB Endowment, vol. 10, iss. 7, pp. 781-792, March 2017.
- Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads. Joy Arulraj, Andrew Pavlo, Prashanth Menon. SIGMOD’16, June 26-July 01, 2016, San Francisco, CA, USA.
- Write-Behind Logging. J. Arulraj, M. Perron, A. Pavlo. Proc. VLDB Endow., vol. 10, pp. 337-348, December 2016.
- Let’s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems. Joy Arulraj, Andrew Pavlo, Subramanya R. Dulloor. Proceedings ACM SIGMOD, Melbourne, Victoria, Australia, May 31-June 4, 2015.
- The latest version of Peloton is available for download from the public GitHub repository. Refer to these instructions for building and installing.
- More info on supported platforms, disclaimers, etc.
We thank the members and companies of the PDL Consortium: Alibaba Group, Amazon, Datrium, Facebook, Google, Hewlett Packard Enterprise, Hitachi Ltd., Intel Corporation, IBM, Microsoft Research, NetApp, Inc., Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Seagate Technology, Two Sigma, and Western Digital for their interest, insights, feedback, and support.