DATE: Wednesday, April 4, 2012
TIME: 1 pm - 2 pm
PLACE: GHC 4405 (Reddy Conference Room)

Directions: Raj Reddy Conference Room, GHC 4405, Hillman Center for Future-Generation Technologies (#10b)

SPEAKER: Tim Kraska, Postdoc at UC Berkeley

TITLE: CrowdDB: Answering Impossible Queries

Some database queries cannot be answered by machines alone. Processing such queries requires human input, e.g., for providing information that is missing from the database, for performing computationally difficult functions and for matching, ranking, or aggregating results based on fuzzy criteria. CrowdDB is the first hybrid human/machine database system that uses human input via crowdsourcing to process queries that neither database systems nor search engines can answer adequately. While CrowdDB has many similarities to traditional database systems, there are also important differences. Perhaps most fundamentally, the "closed-world assumption" underlying relational query semantics does not hold in such systems. As a consequence, the meaning of even simple queries is not well defined. Furthermore, monitoring query progress becomes difficult due to the peculiarities of crowdsourced data.

In this talk, I will first provide an overview of the design and implementation of CrowdDB and how we leverage humans during query processing. Afterwards, I will focus on the statistical tools we developed to enable users to reason about query completeness and the tradeoff between time and cost in the absence of the closed-world assumption. This work was done as part of a larger project on Big Data management. At the end of the talk, I will provide an overview of some of my other projects and outline future work.

Tim Kraska is a postdoc in the AMPLab, which is part of the Computer Science Division at UC Berkeley. Currently his research focuses on Big Data management in the cloud and hybrid human/machine database systems. Before joining UC Berkeley, Tim Kraska received his PhD from ETH Zurich, where he worked on transaction management and stream processing in the cloud as part of the Systems Group. He received a Swiss National Science Foundation Prospective Researcher Fellowship (2010), a DAAD Scholarship (2006), a University of Sydney Master of Information Technology Scholarship for outstanding achievement (2005), the University of Sydney Siemens Prize (2005), and a VLDB best demo award (2011).

VISITOR HOST: Garth Gibson

VISITOR COORDINATOR: Jennifer Landefeld

Karen Lindenfelser, 86716, or visit