Distributed speculative execution for reliability and fault tolerance: an operational semantics

Creators: Ţăpuş, Cristian; Hickey, Jason

Abstract

This paper examines the use of speculations, a form of distributed transactions, for improving the reliability and fault tolerance of distributed systems. A speculation is defined as a computation that is based on an assumption that is not validated before the computation is started. If the assumption is later found to be false, the computation is aborted and the state of the program is rolled back; if the assumption is found to be true, the results of the computation are committed. The primary difference between a speculation and a transaction is that a speculation is not isolated—for example, a speculative computation may send and receive messages, and it may modify shared objects. As a result, processes that share those objects may be absorbed into a speculation. We present a syntax, and an operational semantics in two forms. The first one is a speculative model, which takes full advantage of the speculative features. The second one is a nonspeculative, nondeterministic model, where aborts are treated as failures. We prove the equivalence of the two models, demonstrating that speculative execution is equivalent to failure-free computation.

Additional Information

© 2009 Springer. Received: 8 September 2005 Accepted: 12 September 2008. Published online: 11 November 2008. The authors would like to thank Nathan Gray, David Noblet, and Aleksey Nogin for their careful reading of this paper and for their valuable feedback.We also thank the anonymous reviewers for their thoughtful and helpful comments

	All versions	This version
Views	0	0
Downloads	0	0
Data volume	0 Bytes	0 Bytes

Distributed speculative execution for reliability and fault tolerance: an operational semantics

Abstract

Additional Information

Additional details