PDL Abstract

Efficient Use of the Query Optimizer for Automated
Physical Design

Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB’07), Vienna, Austria, September 2007.

Stratos Papadomanolakis, Debabrata Dash, Anastasia Ailamaki

Computer Science Department
Carnegie Mellon Universit


MultiMap is an algorithm for mapping multidimensional datasets so as to preserve the data’s spatial locality on disks. Without revealing disk-specific details to applications, MultiMap exploits modern disk characteristics to provide full streaming bandwidth for one (primary) dimension and maximally efficient non-sequential access (i.e., minimal seek and no rotational latency) for the other dimensions. This is in contrast to existing approaches, which either severely penalize non-primary dimensions or fail to provide full streaming bandwidth for any dimension. Experimental evaluation of a prototype implementation demonstrates MultiMap’s superior performance for range and beam queries. On average,MultiMap reduces total I/O time by over 50% when compared to traditional linearized layouts and by over 30%when compared to space-filling curve approaches such as Z-ordering and Hilbert curves. For scans of the primary dimension, MultiMap and traditional linearized layouts provide almost two orders of magnitude higher throughput than space-filling curve approaches.