Order-Preserving Key Compression for In-Memory Search Trees
FuDiCo II: S.O.S. (Survivability: Obstacles and Solutions), 2nd Bertinoro Workshop on Future Directions in Distributed Computing, 23-25 June 2004, University of Bologna Residential Center, Bertinoro (Forlì), Italy.
Huanchen Zhang, Xiaoxuan Liu, David G. Andersen, Michael Kaminsky*, Kimberly Keeton^, Andrew PavloCarnegie Mellon University
^Hewlett Packard Labs
We present the High-speed Order-Preserving Encoder (HOPE) for in-memory search trees. HOPE is a fast dictionary-based compressor that encodes arbitrary keys while preserving their order. HOPE’s approach is to identify common key patterns at a fine granularity and exploit the entropy to achieve high compression rates with a small dictionary. We first develop a theoretical model to reason about order-preserving dictionary designs. We then select six representative compression schemes using this model and implement them in HOPE. These schemes make different trade-offs between compression rate and encoding speed. We evaluate HOPE on five data structures used in databases: SuRF, ART, HOT, B+tree, and Prefix B+tree. Our experiments show that using HOPE allows the search trees to achieve lower query latency (up to 40% lower) and better memory efficiency (up to 30% smaller) simultaneously for most string key workloads.
FULL PAPER: pdf