PDL Abstract

Zzyzx: Scalable Fault Tolerance Through Byzantine Locking

Proceedings of the 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. Chicago, Illinois, June 2010.

James Hendricks*, Shafeeq Sinnamohideen, Gregory R. Ganger, Michael K. Reiter**

Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

**University of North Carolina at Chapel Hill

Zzyzx is a Byzantine fault-tolerant replicated state machine protocol that outperforms prior approaches and provides near-linear throughput scaling. Using a new technique called Byzantine Locking, Zzyzx allows a client to extract state from an underlying replicated state machine and access it via a second protocol specialized for use by a single client. This second protocol requires just one round trip and 2f+1 responsive servers. Compared to Zyzzyva, this results in 39–43% lower response times and 2.2–2.9× higher throughput. Furthermore, the extracted state can be transferred to other servers, allowing nonoverlapping sets of servers to manage different state. Thus, Zzyzx allows throughput to be scaled by adding servers when concurrent data sharing is not common. When data sharing is common, performance can match that of the underlying replicated state machine protocol.
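
The server counts in the abstract can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' code: the 2f+1 figure comes from the abstract, while the function names and the default group size of 3f+1 servers are assumptions made here for illustration (3f+1 is the usual replica count for Byzantine fault-tolerant state machines).

```python
from typing import Optional


def responsive_quorum(f: int) -> int:
    """Servers that must respond in the single-client protocol: 2f+1 (from the abstract)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1


def disjoint_groups(total_servers: int, f: int, group_size: Optional[int] = None) -> int:
    """Number of nonoverlapping server groups that fit in total_servers.

    group_size defaults to 3f+1, which is an assumption of this sketch,
    not a figure stated in the abstract.
    """
    if group_size is None:
        group_size = 3 * f + 1
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return total_servers // group_size


if __name__ == "__main__":
    f = 1
    print("responsive servers needed:", responsive_quorum(f))
    for n in (4, 8, 12):
        print(n, "servers ->", disjoint_groups(n, f), "disjoint group(s)")
```

Under these assumptions, each added group of servers can manage a separate portion of the extracted state, which is the intuition behind scaling throughput by adding servers when data is rarely shared.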