Low-overhead time-triggered group membership

  • Contributed Papers
  • Conference paper
Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1320))

Abstract

A group membership protocol is presented and proven correct for a synchronous time-triggered model of computation with processors in a ring that broadcast in turn. The protocol, derived from one used for critical control functions in automobiles, accepts a very restrictive fault model to achieve low overhead and requires only one bit of membership information piggybacked on regular broadcasts. Given its strong fault model, the protocol guarantees that a faulty processor will be promptly diagnosed and removed from the agreed group of processors, and will also diagnose itself as faulty. The protocol is correct under a fault-arrival assumption that new faults arrive at least n + 1 time units apart, when there are n processors. Exploiting this assumption leads to unusual real-time reasoning in the correctness proof.
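
The fault-arrival assumption stated in the abstract can be illustrated with a minimal sketch. It is not the paper's protocol: the slot numbering (processors 0 to n - 1, one broadcast per time unit) and the function names `broadcaster` and `satisfies_fault_arrival` are assumptions made here for illustration only.

```python
from typing import Sequence


def broadcaster(slot: int, n: int) -> int:
    """Processor whose turn it is to broadcast in a given slot.

    Assumes processors are numbered 0..n-1 around the ring and that
    exactly one processor broadcasts per time unit (hypothetical numbering).
    """
    return slot % n


def satisfies_fault_arrival(fault_times: Sequence[int], n: int) -> bool:
    """Check that consecutive fault arrivals are at least n + 1 time units apart."""
    ordered = sorted(fault_times)
    return all(
        later - earlier >= n + 1
        for earlier, later in zip(ordered, ordered[1:])
    )
```

With n = 4 processors, faults at times 0, 5 and 10 satisfy the assumption because each gap is 5 = n + 1, while faults at times 0 and 4 do not.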

This work was supported by ARPA through USAF Electronic Systems Center Contract F19628-96-C-0006, by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under contract F49620-95-C0044, and by the National Science Foundation under contract CCR-9509931.

Editor information

Marios Mavronicolas, Philippas Tsigas

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Katz, S., Lincoln, P., Rushby, J. (1997). Low-overhead time-triggered group membership. In: Mavronicolas, M., Tsigas, P. (eds) Distributed Algorithms. WDAG 1997. Lecture Notes in Computer Science, vol 1320. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030682

  • DOI: https://doi.org/10.1007/BFb0030682

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63575-8

  • Online ISBN: 978-3-540-69600-1

  • eBook Packages: Springer Book Archive
