PARALLEL DATA LAB 

PDL Abstract

QPipe: A Simultaneously Pipelined Relational Query Engine

SIGMOD 2005, June 14-16, 2005, Baltimore, Maryland, USA.

Stavros Harizopoulos, Anastassia Ailamaki, Vladislav Shkapenyuk*

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

*Rutgers University, work done while at CMU

http://www.pdl.cmu.edu/

Relational DBMSs typically execute concurrent queries independently by invoking a set of operator instances for each query. To exploit common data retrievals and computation in concurrent queries, researchers have proposed a wealth of techniques, ranging from buffering disk pages to constructing materialized views and optimizing multiple queries. The ideas proposed, however, are inherently limited by the query-centric philosophy of modern engine designs. Ideally, the query engine should proactively coordinate same-operator execution among concurrent queries, thereby exploiting common accesses to memory and disks as well as common intermediate result computation.

This paper introduces on-demand simultaneous pipelining (OSP), a novel query evaluation paradigm for maximizing data and work sharing across concurrent queries at execution time. OSP enables proactive, dynamic operator sharing by pipelining the operator’s output simultaneously to multiple parent nodes. This paper also introduces QPipe, a new operator-centric relational engine that effortlessly supports OSP. Each relational operator is encapsulated in a micro-engine serving query tasks from a queue, naturally exploiting all data and work sharing opportunities. Evaluation of QPipe built on top of BerkeleyDB shows that QPipe achieves a 2x speedup over a commercial DBMS when running a workload consisting of TPC-H queries.
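The operator-centric idea can be illustrated with a small sketch. This is not QPipe's implementation: the class name ScanMicroEngine, its methods, and the in-memory tables are hypothetical and chosen only for illustration. The sketch assumes a single scan operator whose queued requests for the same table are served by one physical scan, with each row pipelined to every requesting query. It simplifies OSP by merging only requests that are already queued, rather than attaching new queries to an operator that is already running.

```python
from collections import defaultdict, deque


class ScanMicroEngine:
    """Hypothetical micro-engine for a table-scan operator (illustrative only)."""

    def __init__(self, tables):
        self.tables = tables      # assumed layout: table name -> list of rows
        self.queue = deque()      # pending (table, query_id) scan requests
        self.scans_performed = 0  # number of physical scans, to make sharing visible

    def submit(self, table, query_id):
        """Enqueue a scan request on behalf of one query."""
        self.queue.append((table, query_id))

    def run(self):
        """Serve all queued requests, sharing one scan per distinct table."""
        # Group the queued requests by the table they want to read.
        consumers = defaultdict(list)
        while self.queue:
            table, query_id = self.queue.popleft()
            consumers[table].append(query_id)

        outputs = {}
        for table, query_ids in consumers.items():
            for query_id in query_ids:
                outputs[query_id] = []
            self.scans_performed += 1
            # One pass over the table; each row goes to every consumer.
            for row in self.tables[table]:
                for query_id in query_ids:
                    outputs[query_id].append(row)
        return outputs


if __name__ == "__main__":
    engine = ScanMicroEngine({"lineitem": [1, 2, 3], "orders": [10, 20]})
    engine.submit("lineitem", "q1")
    engine.submit("lineitem", "q2")  # same table as q1, so the scan is shared
    engine.submit("orders", "q3")
    results = engine.run()
    print(results, engine.scans_performed)
```

In this toy setting, three queries are answered with two physical scans because q1 and q2 read the same table. A query-centric design would instead create one scan instance per query.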
