PARALLEL DATA LAB 

PDL Abstract

Order-Preserving Key Compression for In-Memory Search Trees

FuDiCo II: S.O.S. (Survivability: Obstacles and Solutions), 2nd Bertinoro Workshop on Future Directions in Distributed Computing, 23-25 June 2004, University of Bologna Residential Center, Bertinoro (Forlì), Italy.

Huanchen Zhang, Xiaoxuan Liu, David G. Andersen, Michael Kaminsky*, Kimberly Keeton^, Andrew Pavlo

Carnegie Mellon University
*BrdgAI
^Hewlett Packard Labs

http://www.pdl.cmu.edu/

We present the High-speed Order-Preserving Encoder (HOPE) for in-memory search trees. HOPE is a fast dictionary-based compressor that encodes arbitrary keys while preserving their order. HOPE’s approach is to identify common key patterns at a fine granularity and exploit the entropy to achieve high compression rates with a small dictionary. We first develop a theoretical model to reason about order-preserving dictionary designs. We then select six representative compression schemes using this model and implement them in HOPE. These schemes make different trade-offs between compression rate and encoding speed. We evaluate HOPE on five data structures used in databases: SuRF, ART, HOT, B+tree, and Prefix B+tree. Our experiments show that using HOPE allows the search trees to achieve lower query latency (up to 40% lower) and better memory efficiency (up to 30% smaller) simultaneously for most string key workloads.

FULL PAPER: pdf