PARALLEL DATA LAB 

PDL Abstract

Storage Device Performance Prediction with CART Models

12th Annual Meeting of the IEEE/ACM International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS). Volendam, The Netherlands. October 5-7, 2004. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-103, March 2004.

Mengzhi Wang, Kinman Au*, Anastassia Ailamaki, Anthony Brockwell*, Christos Faloutsos, Gregory R. Ganger**

School of Computer Science
Dept. of Statistics*
Dept. of Electrical and Computer Engineering**
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

Storage device performance prediction is a key element of self-managed storage systems. This work explores the application of a machine learning tool, CART models, to storage device modeling. Our approach predicts a device’s performance as a function of input workloads, requiring no knowledge of the device internals. We propose two uses of CART models: one that predicts per-request response times (and then derives aggregate values) and one that predicts aggregate values directly from workload characteristics. After being trained on the device in question, both provide accurate black-box models across a range of test traces from real environments. Experiments show that these models predict the average and 90th percentile response time with a relative error as low as 19% when the training workloads are similar to the testing workloads, and that they interpolate well across different workloads.
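The two uses described above can be illustrated with a minimal sketch. It is not the paper's implementation: the regression tree is a simplified, unpruned CART-style learner, the request features (inter-arrival gap, seek distance, request size) are hypothetical choices, and the "device" is a synthetic formula standing in for real traces. Only the overall shape follows the abstract: a request-level model whose per-request predictions are aggregated, and a workload-level model that predicts the aggregate directly.

```python
import math
import random

def avg(v):
    return sum(v) / len(v)

def p90(v):
    # Nearest-rank 90th percentile.
    return sorted(v)[math.ceil(0.9 * len(v)) - 1]

def sse(v):
    m = avg(v)
    return sum((x - m) ** 2 for x in v)

def rel_err(predicted, actual):
    return abs(predicted - actual) / actual

def fit_tree(X, y, depth=4, min_leaf=5):
    """Greedy CART-style regression tree: choose the split minimizing squared error."""
    best = None
    if depth > 0 and len(y) >= 2 * min_leaf:
        for f in range(len(X[0])):
            for t in sorted(set(row[f] for row in X)):
                left = [i for i, row in enumerate(X) if row[f] <= t]
                right = [i for i, row in enumerate(X) if row[f] > t]
                if min(len(left), len(right)) < min_leaf:
                    continue
                cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
                if best is None or cost < best[0]:
                    best = (cost, f, t, left, right)
    if best is None:
        return avg(y)  # leaf: mean target value of the training samples that reach it
    _, f, t, left, right = best
    sub = lambda idx: fit_tree([X[i] for i in idx], [y[i] for i in idx], depth - 1, min_leaf)
    return (f, t, sub(left), sub(right))

def predict(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def synthetic_trace(n, seed):
    """Hypothetical stand-in for a real trace: features and response time (ms) per request."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        gap_ms = rng.expovariate(1 / 5.0)      # inter-arrival time
        jump = rng.random()                    # seek distance as a fraction of the disk
        size_kb = rng.choice([4, 8, 16, 64])   # request size
        X.append((gap_ms, jump, size_kb))
        # Assumed toy device behavior, not a measured one.
        y.append(1.0 + 8.0 * jump + 0.05 * size_kb + rng.random())
    return X, y

def windows(X, y, w=20):
    """Summarize each run of w requests by its mean features and mean response time."""
    WX, Wy = [], []
    for s in range(0, len(y) - w + 1, w):
        chunk = X[s:s + w]
        WX.append(tuple(avg([r[f] for r in chunk]) for f in range(len(chunk[0]))))
        Wy.append(avg(y[s:s + w]))
    return WX, Wy

train_X, train_y = synthetic_trace(300, seed=1)
test_X, test_y = synthetic_trace(300, seed=2)

# Use 1: request-level model; aggregates are derived from per-request predictions.
req_tree = fit_tree(train_X, train_y)
req_pred = [predict(req_tree, r) for r in test_X]
req_avg_err = rel_err(avg(req_pred), avg(test_y))
req_p90_err = rel_err(p90(req_pred), p90(test_y))

# Use 2: workload-level model; the aggregate is predicted from workload characteristics.
train_WX, train_Wy = windows(train_X, train_y)
test_WX, test_Wy = windows(test_X, test_y)
wl_tree = fit_tree(train_WX, train_Wy, depth=3, min_leaf=3)
wl_pred = [predict(wl_tree, r) for r in test_WX]
wl_avg_err = rel_err(avg(wl_pred), avg(test_Wy))

print(f"request-level:  avg err {req_avg_err:.1%}, p90 err {req_p90_err:.1%}")
print(f"workload-level: avg err {wl_avg_err:.1%}")
```

Because the training and test traces here come from the same synthetic generator, the errors it prints say nothing about real devices; the sketch only shows where each model takes its inputs and how the relative error of an aggregate is computed.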

KEYWORDS: Performance prediction, storage device modeling

FULL PAPER: pdf / postscript
ORIGINAL TR VERSION OF THIS PAPER: pdf