Companion of the 2025 International Conference on Management of Data, 2025, pp. 247-250. June 22-27, 2025, Berlin, Germany.
Patrick Wang, Wan Shen Lim, William Zhang, Samuel Arch, Andrew Pavlo
Carnegie Mellon University
http://www.pdl.cmu.edu/
Machine learning (ML) has gained traction in academia and industry for database management system (DBMS) automation. Although studies demonstrate that ML-based tuning agents match or exceed human expert performance in optimizing DBMSs, researchers continue to build bespoke tuning pipelines from the ground up. The lack of reusable infrastructure leads to redundant engineering effort and makes modeling methods harder to compare. This paper demonstrates the database gym framework, a standardized training environment that provides a unified API with pluggable components. The database gym simplifies ML model training and evaluation to accelerate autonomous DBMS research. In this demonstration, we showcase the effectiveness of automated tuning and the gym’s ease of use by allowing a human expert to compete against an ML-based tuning agent implemented in the gym.