PARALLEL DATA LAB 

PDL Abstract

Modeling the Relative Fitness of Storage

SIGMETRICS’07, June 12–16, 2007, San Diego, California, USA.

Michael P. Mesnier*, Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, Gregory R. Ganger

Parallel Data Laboratory,
Carnegie Mellon University
Pittsburgh, PA 15213

* CMU and Intel

http://www.pdl.cmu.edu/

Relative fitness is a new black-box approach to modeling the performance of storage devices. In contrast with an absolute model that predicts the performance of a workload on a given storage device, a relative fitness model predicts performance differences between a pair of devices. There are two primary advantages to this approach. First, because a relative fitness model is constructed for a device pair, the application-device feedback of a closed workload can be captured (e.g., how the I/O arrival rate changes as the workload moves from device A to device B). Second, a relative fitness model allows performance and resource utilization to be used in place of workload characteristics. This is beneficial when workload characteristics are difficult to obtain or to express concisely (e.g., rather than describe the spatio-temporal characteristics of a workload, one could use the observed cache behavior of device A to help predict the performance of device B).

This paper describes the steps necessary to build a relative fitness model, with an approach that is general enough to be used with any black-box modeling technique. We compare relative fitness models and absolute models across a variety of workloads and storage devices. On average, relative fitness models predict bandwidth and throughput within 10–20% and can reduce prediction error by as much as a factor of two when compared to absolute models.
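The contrast between the two model types can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that the performance difference is expressed as a ratio (bandwidth on B divided by bandwidth on A), and it uses a 1-nearest-neighbor regressor as a stand-in for whatever black-box learner is preferred (the keywords mention CART). All names here (`nearest_neighbor`, `fit_absolute`, `fit_relative`, and the feature layout) are hypothetical.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[float]], float]


def nearest_neighbor(X: Sequence[Sequence[float]], y: Sequence[float]) -> Predictor:
    """Stand-in black-box learner (assumption): 1-nearest-neighbor regression."""
    rows = [list(r) for r in X]
    targets = list(y)

    def predict(x: Sequence[float]) -> float:
        # Return the target of the training row closest to x (squared Euclidean distance).
        best = min(
            range(len(rows)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(rows[i], x)),
        )
        return targets[best]

    return predict


def fit_absolute(workloads: List[List[float]], bw_b: List[float],
                 learner=nearest_neighbor):
    """Absolute model: workload characteristics -> bandwidth on device B."""
    model = learner(workloads, bw_b)
    return lambda workload: model(workload)


def fit_relative(workloads: List[List[float]], perf_a: List[List[float]],
                 bw_a: List[float], bw_b: List[float],
                 learner=nearest_neighbor):
    """Relative model for the pair (A, B).

    Inputs are workload characteristics plus what was observed on device A;
    the training target is the ratio bw_B / bw_A (an assumption of this sketch).
    """
    features = [list(w) + list(p) for w, p in zip(workloads, perf_a)]
    ratios = [b / a for a, b in zip(bw_a, bw_b)]
    model = learner(features, ratios)

    def predict(workload: Sequence[float], observed_a: Sequence[float],
                observed_bw_a: float) -> float:
        # Scale the bandwidth observed on A by the predicted A-to-B ratio.
        return observed_bw_a * model(list(workload) + list(observed_a))

    return predict
```

To use the sketch, train both models on runs of the same workloads on devices A and B, then call the relative predictor with a new workload's characteristics and its observed behavior on A. Because the relative model scales the bandwidth observed on A, its prediction follows changes in that observation, which the absolute model cannot see.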

KEYWORDS: black-box, storage, modeling, CART
