Travelling through wormholes: a new look at distributed systems models

Published: 01 March 2006

Abstract

The evolution of distributed computing and applications has posed new challenges for models, architectures and systems. To name just one, 'reconciling uncertainty with predictability' is required by today's simultaneous pressure on increasing the quality of service of applications, and on degrading the assurance given by the infrastructure. This challenge can be mapped onto more than one facet, such as time or security. In this paper we explore the time facet, reviewing the past and present of distributed systems models, and making the case for the use of hybrid (vs. homogeneous) models as a key to overcoming some of the difficulties faced when asynchronous models (uncertainty) meet timing specifications (predictability). The Wormholes paradigm is described as the first experiment with hybrid distributed systems models.

Published In

ACM SIGACT News, Volume 37, Issue 1
March 2006
93 pages
ISSN: 0163-5700
DOI: 10.1145/1122480

Publisher

Association for Computing Machinery

New York, NY, United States
