Thursday, December 8, 2005
12.00 noon – 1.00 pm
Intel Seminar (CIC Suite 410)
EVENTS PAGE: http://www.intel-research.net/pittsburgh/events.htm
Intel Research Pittsburgh
On Multidimensional Data and Modern Disks
With the deeply ingrained notion that disks can efficiently access only one-dimensional data, current approaches for mapping multidimensional data to disk blocks either allow efficient access in only one dimension, at the expense of access in the other dimensions, or penalize access in all dimensions equally. Yet existing technology and functions readily available inside disk firmware can identify non-contiguous logical blocks that preserve the spatial locality of multidimensional datasets. These blocks, which span on the order of a hundred adjacent tracks, can be accessed with minimal positioning cost. This paper details these technologies, analyzes their trends, and shows how they can be exposed to applications while maintaining existing abstractions. The described approach can achieve the best possible access efficiency afforded by the disk technologies: sequential access along the primary dimension and access with minimal positioning cost along all other dimensions. Experimental evaluation of a prototype implementation demonstrates that it reduces overall I/O time for multidimensional data queries by 30% to 50% compared with existing approaches.
Steve Schlosser joined Intel Research Pittsburgh in 2004 and has been working on data storage optimization for databases and scientific computing, execution logging for safety and performance in multicore architectures, and other aspects of intelligent storage infrastructures. He graduated from Carnegie Mellon University in 2004, where he studied the impact on computer systems of alternative storage technologies built using MEMS.
Contact Kim Kaan, 412-605-1203,
or visit http://www.intel-research.net.
SDI Home: http://www.pdl.cmu.edu/SDI/