Proceedings of the 30th VLDB Conference. Toronto, Canada, 29 August - 3 September 2004. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-102, March 2004.
Minglong Shao*, Jiri Schindler, Steven W. Schlosser, Anastassia Ailamaki*, Gregory R. Ganger
School of Computer Science*
Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
http://www.pdl.cmu.edu/
As database application performance depends on the utilization of the disk and memory hierarchy, and the speed gap between the processor and memory components widens, smart data placement plays a central role in increasing locality and in improving memory utilization. Existing techniques, however, do not optimize accesses to all levels of the memory hierarchy or for all workloads, because each storage level uses a different technology (cache, memory, disks) and each application accesses data using different (often conflicting) patterns. This paper introduces Clotho, a new buffer pool and storage management architecture. Clotho decouples in-memory page layout from data organization on non-volatile storage devices, enabling independent data layout design at each level of the storage hierarchy. Using Clotho, a DBMS can maximize cache and memory utilization by (a) transparently using appropriate data layouts in memory and on non-volatile storage, and (b) dynamically synthesizing data pages to follow application access patterns at each level as needed. Clotho enables (a) independently tailored page layouts for dynamically changing as well as compound workloads, and (b) use of alternative technologies at each level (e.g., disk arrays or MEMS-based storage devices). We describe the Clotho design and implementation using disk array logical volumes and simulated MEMS-based storage devices, and we evaluate performance under a variety of workloads.
KEYWORDS: Disk arrays, disk performance, database access
FULL PAPER, CONFERENCE VERSION: pdf
FULL PAPER, ORIGINAL TR VERSION: pdf