Distributed speculative execution for reliability and fault tolerance: an operational semantics
- Creators
- Ţăpuş, Cristian
- Hickey, Jason
Abstract
This paper examines the use of speculations, a form of distributed transactions, for improving the reliability and fault tolerance of distributed systems. A speculation is defined as a computation that is based on an assumption that is not validated before the computation is started. If the assumption is later found to be false, the computation is aborted and the state of the program is rolled back; if the assumption is found to be true, the results of the computation are committed. The primary difference between a speculation and a transaction is that a speculation is not isolated—for example, a speculative computation may send and receive messages, and it may modify shared objects. As a result, processes that share those objects may be absorbed into a speculation. We present a syntax, and an operational semantics in two forms. The first one is a speculative model, which takes full advantage of the speculative features. The second one is a nonspeculative, nondeterministic model, where aborts are treated as failures. We prove the equivalence of the two models, demonstrating that speculative execution is equivalent to failure-free computation.
Additional Information
© 2009 Springer. Received: 8 September 2005 Accepted: 12 September 2008. Published online: 11 November 2008. The authors would like to thank Nathan Gray, David Noblet, and Aleksey Nogin for their careful reading of this paper and for their valuable feedback.We also thank the anonymous reviewers for their thoughtful and helpful commentsAdditional details
- Eprint ID
- 16079
- DOI
- 10.1007/s00446-008-0073-1
- Resolver ID
- CaltechAUTHORS:20090928-140300739
- Created
-
2009-10-06Created from EPrint's datestamp field
- Updated
-
2021-11-08Created from EPrint's last_modified field