Parallel Data Laboratory

FAWN: A Fast Array of Wimpy Nodes

FAWN is a fast, scalable, and energy-efficient cluster architecture for data-intensive computing. A FAWN cluster links together a large number of "wimpy" nodes built using energy-efficient processors and small amounts of flash memory into an ensemble cluster that can perform the same amount of work as a traditional cluster but at a fraction of the power.

We have designed and implemented a clustered key-value storage system, FAWN-KV, that runs atop these node. Nodes in FAWN-KV use a specialized log-like back-end hash-based datastore (FAWN-DS) to ensure that the system can absorb the large write workload imposed by frequent node arrivals and departures. FAWN-KV uses a two-level cache hierarchy to ensure that imbalanced workloads cannot create hot-spots on one or a few wimpy nodes that impair the system's ability to service queries at its guaranteed rate.

Our evaluation of a small-scale FAWN cluster and several candidate FAWN node systems suggest that FAWN can be a practical approach to building large-scale storage for seek-intensive workloads. Our further analysis indicates that a FAWN cluster is cost-competitive with other approaches (e.g., DRAM, multitudes of magnetic disks, solid-state disk) to providing high query rates, while consuming significantly less power.

FAWN 5G, 4G, 3G, 2G, and 1G Prototypes

FAWN 5G

FAWN 4G

FAWN 3G

FAWN 2G

FAWN 1G

People

FACULTY

Dave Andersen

GRAD STUDENTS

Jason Franklin
Amar Phanishayee
Iulian Moraru
Lawrence Tan
Vijay Vasudevan

Publications

Using Vector Interfaces to Deliver Millions of IOPS from a Networked Key-value Storage Server. Vijay Vasudevan, Michael Kaminsky, David G. Andersen. SOCC'12, October 14-17, 2012, San Jose, CA USA.
Abstract / PDF [648K]
FAWNSort: Energy-efficient Sorting of 10GB, 100GB, and 1TB (2012). Padmanabhan Pillai, Michael Kaminsky, Michael A. Kozuch, David Andersen. Winner of 2012 10GB, 100GB, and 1TB, Joulesort Daytona and Indy categories.
PDF
SILT: A Memory-Efficient, High-Performance Key-Value Store. Hyeontaek Lim, Bin Fan, David Andersen, Michael Kaminsky. In Proc. 23nd ACM Symposium on Operating Systems Principles (SOSP 2011), Cascais, Portugal. October 2011.
Abstract / PDF [1.15M]
Small Cache, Big Effect: Provable Load Balancing for Randomly Partitioned Cluster Services. Bin Fan, Hyeontaek Lim, David Andersen, Michael Kaminsky. In Proc. ACM Symposium on Cloud Computing (SOCC 2011), Cascais, Portugal. October 2011.
Abstract / PDF [336K]
FAWN: A Fast Array of Wimpy Nodes. David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan. In Communications of the ACM, July 2011.
Abstract / PDF
FAWNSort: Energy-efficient Sorting of 10GB. Vijay Vasudevan Lawrence Tan, David Andersen, Michael Kaminsky, Michael A. Kozuch, Padmanabhan Pillai, Winner of 2010 10GB Joulesort, Daytona and Indy categories. http://sortbenchmark.org/. July 2010
Abstract / PDF [90K]
Energy-efficient Cluster Computing with FAWN: Workloads and Implications. Vijay Vasudevan David Andersen, Michael Kaminsky, Lawrence Tan, Jason Franklin, Iulian Moraru . Proceedings of 1st Int'l Conf. on Energy-Efficient Computing & Networking (e-Energy 2010), Univ. of Passau, Germany. April 13-15, 2010.
Abstract / PDF [645K]
FAWN: A Fast Array of Wimpy Nodes. David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan. Proc. 22nd ACM Symposium on Operating Systems Principles (SOSP 2009), Big Sky, MT. October 2009. BEST PAPER AWARD!
Abstract / PDF [332K]
FAWNdamentally Power-efficient Clusters. Vijay Vasudevan, Jason Franklin, David Andersen, Amar Phanishayee, Lawrence Tan, Michael Kaminsky, and Iulian Moraru. Proc. 12th Workshop on Hot Topics in Operating Systems (HotOS XII), Monte Verita, May 2009.
Abstract / PDF

Source Code

Source code for basic FAWN-KV
https://github.com/vrv/FAWN-KV
Source code for SILT
https://github.com/silt/silt

Acknowledgements

We thank the members and companies of the PDL Consortium: Amazon, Google, Hitachi Ltd., Honda, Intel Corporation, IBM, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.