PDL PROJECTS

Incremental Computation (ThomasDB)

Contact: Andy Pavlo

Automatic incrementalization is a difficult problem: it requires sophisticated algorithms and a runtime system that supports performant code. There is a natural tension between generality and efficiency, especially when building practical systems. Moreover, existing self-adjusting computation techniques are only suitable for single-node, sequential systems.
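The core idea of self-adjusting computation can be shown in a few lines. The sketch below is a minimal, hypothetical illustration (the `Cell` and `Thunk` names are ours, not ThomasDB's API): input cells record which computations read them, so when an input changes, only the dependent computations rerun.

```python
# Minimal sketch of self-adjusting computation (illustrative names):
# cells hold inputs, thunks record which cells they read, and a change
# propagates only to computations that actually depend on it.

class Cell:
    def __init__(self, value):
        self.value = value
        self.readers = set()        # thunks that have read this cell

    def read(self, thunk):
        self.readers.add(thunk)     # record the dependency edge
        return self.value

    def write(self, value):
        self.value = value
        for t in list(self.readers):
            t.recompute()           # change propagation: rerun readers only

class Thunk:
    def __init__(self, fn):
        self.fn = fn
        self.runs = 0
        self.recompute()

    def recompute(self):
        self.runs += 1
        self.result = self.fn(self)

# `total` depends on cells a and b; cell c is never read by it.
a, b, c = Cell(1), Cell(2), Cell(100)
total = Thunk(lambda t: a.read(t) + b.read(t))
assert total.result == 3

a.write(10)                 # only computations that read `a` rerun
assert total.result == 12
c.write(0)                  # no dependents: nothing recomputes
assert total.runs == 2
```

In a sequential setting this bookkeeping is cheap; the challenges listed below arise when the dependency graph must be maintained across many machines.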

We are applying incremental computation techniques to large-scale dynamic data sets and massively parallel fault-tolerant systems to enable more efficient and responsive machine learning. The key challenges are as follows:

  1. How do we parallelize existing incremental computation methods?
  2. How do we manage the large data sets of runtime metadata (e.g., dependency information, results of intermediate computations) needed to support incremental computation?
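The second challenge can be made concrete with a small sketch: the runtime keeps a memo table of intermediate results keyed by their inputs, so an update reuses any work whose inputs did not change. At scale, this table is itself a large data set that must be partitioned and managed. (The structure below is hypothetical, not ThomasDB's actual metadata layout.)

```python
# Sketch of runtime metadata for incremental computation: a memo table
# of intermediate results keyed by (operation, inputs). Editing one
# chunk of the input recomputes only that chunk's subtotal.

calls = 0
memo = {}                    # (operation name, inputs) -> cached result

def memoized(name, fn, *args):
    global calls
    key = (name, args)
    if key not in memo:
        calls += 1           # count the actual (non-cached) computations
        memo[key] = fn(*args)
    return memo[key]

def total(chunks):
    # Sum per-chunk subtotals; each subtotal is memoized independently.
    return sum(memoized("subtotal", sum, tuple(c)) for c in chunks)

data = [[1, 2], [3, 4], [5, 6]]
assert total(data) == 21 and calls == 3

data[1] = [30, 40]           # update a single chunk
assert total(data) == 84     # correct result...
assert calls == 4            # ...but only one new subtotal was computed
```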

We are also exploring richer forms of incremental computation that allow for more interesting interactions with data, enabling incremental computation to be used within a computation itself. In addition, we are investigating new abstractions for expressing and implementing interactive computations that are both high level and efficient. Many interactive computations today are written against an event-driven programming model, which yields low-level programs that are difficult to reason about; our goal is to replace these with abstractions that retain efficiency without sacrificing clarity.

The project will apply and improve techniques from elastic database management systems (DBMSs), high-performance transaction processing, and non-relational DBMSs. The runtime system will employ machine learning to automatically reconfigure data partitioning and placement strategies as database access patterns change.
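The contrast between the two styles can be sketched as follows (hypothetical names, not the project's abstraction): in the event-driven style, every handler must manually patch every piece of derived state it affects, whereas a high-level style states the derived value once as a pure function of the inputs and relies on a runtime to keep it current.

```python
# Event-driven style: derived state (the leaderboard) is patched by hand
# inside each handler; forgetting one update site silently corrupts it.
scores = {}
leaderboard = []                     # derived state, maintained manually

def on_score(player, points):
    scores[player] = scores.get(player, 0) + points
    # every handler must remember to refresh every derived structure
    leaderboard[:] = sorted(scores, key=scores.get, reverse=True)

# High-level style: the derived value is a pure function of the inputs;
# an incremental runtime (as proposed here) would keep it up to date
# efficiently instead of recomputing it from scratch.
def top_players(scores):
    return sorted(scores, key=scores.get, reverse=True)

on_score("ada", 3)
on_score("bob", 5)
assert leaderboard == ["bob", "ada"]
assert top_players(scores) == leaderboard
```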

People

FACULTY

Andy Pavlo
Umut Acar

GRAD STUDENTS

Thomas Marshall
Ram Raghunathan

Publications

Acknowledgements

This research is funded (in part) by Microsoft Research.

We thank the members and companies of the PDL Consortium: Amazon, Google, Hitachi Ltd., Honda, Intel Corporation, IBM, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.


© 2024.
Last updated 29 August, 2016