Modular Model-Checking of a Byzantine Fault-Tolerant Protocol

  • Conference paper
NASA Formal Methods (NFM 2017)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10227)

Abstract

With proof techniques like IC3 and k-induction, model-checking scales further than ever before. Still, fault-tolerant distributed systems are particularly challenging to model-check given their large state spaces and non-determinism. The typical approach to controlling complexity is to construct ad-hoc abstractions of faults, message-passing, and behaviors. However, these abstractions come at the price of divorcing the model from its implementation and making refactoring difficult. In this work, we present a model for fault-tolerant distributed system verification that combines ideas from the literature including calendar automata, symbolic fault injection, and abstract transition systems, and then use it to model-check various implementations of the Hybrid Oral Messages algorithm that differ in the fault model, timing model, and local node behavior. We show that despite being implementation-level models, the verifications are scalable and modular, insofar as isolated changes to an implementation require isolated changes to the model and proofs. This work is carried out in the SAL model-checker.
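As a rough illustration of the kind of property checked in this setting, the following minimal sketch brute-forces agreement and validity for the classic OM(1) exchange (one general, n lieutenants) under a single arbitrary fault. It is not the authors' SAL model and does not cover the hybrid fault model, calendar automata, or timing. The names `om1`, `majority`, and `check_om1` are hypothetical, and explicit enumeration of fault behaviors stands in here for symbolic model-checking.

```python
from collections import Counter
from itertools import product

GENERAL = 0  # node id of the general; lieutenants are 1..n (assumed numbering)


def majority(votes, default=0):
    """Return the strict-majority value, or `default` if there is none."""
    value, count = Counter(votes).most_common(1)[0]
    return value if 2 * count > len(votes) else default


def om1(n, general_value, faulty, lies):
    """Run one OM(1) exchange among a general and n lieutenants.

    faulty: set of faulty node ids.
    lies:   dict (sender, receiver) -> value a faulty sender transmits.
    Returns the decisions of the non-faulty lieutenants.
    """
    lieutenants = range(1, n + 1)
    # Round 1: the general sends its value to every lieutenant.
    received = {
        i: lies[(GENERAL, i)] if GENERAL in faulty else general_value
        for i in lieutenants
    }
    # Round 2: lieutenants relay what they received, then each one votes.
    decisions = {}
    for i in lieutenants:
        if i in faulty:
            continue
        votes = [received[i]]
        for j in lieutenants:
            if j != i:
                votes.append(lies[(j, i)] if j in faulty else received[j])
        decisions[i] = majority(votes)
    return decisions


def check_om1(n=3, values=(0, 1)):
    """Enumerate every single-fault scenario; check agreement and validity."""
    for bad in range(n + 1):
        receivers = [r for r in range(1, n + 1) if r != bad]
        for general_value in values:
            # Every combination of values the faulty node could send.
            for sent in product(values, repeat=len(receivers)):
                lies = {(bad, r): v for r, v in zip(receivers, sent)}
                decided = set(om1(n, general_value, {bad}, lies).values())
                if len(decided) != 1:
                    return False  # agreement violated
                if bad != GENERAL and decided != {general_value}:
                    return False  # validity violated
    return True
```

With three lieutenants (four nodes in total) `check_om1()` finds no violation. With only two lieutenants, `check_om1(n=2)` reports a failure, because a single faulty lieutenant can then force a tie in the vote of the other.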

Notes

  1. https://github.com/GaloisInc/mmc-paper.

  2. There are exceptions; for example, benign faults may be detected by a node itself (e.g., in a built-in test).

  3. https://github.com/GaloisInc/atom-sally.

References

  1. Bevier, W.R., Young, W.D.: The proof of correctness of a fault-tolerant circuit design. Computational Logic Inc., Technical report 57 (1990). http://computationallogic.com/reports/index.html

  2. Young, W.D.: Comparing verification systems: interactive consistency in ACL2. IEEE Trans. Softw. Eng. 23(4), 214–223 (1997)

  3. Lincoln, P., Rushby, J.: A formally verified algorithm for interactive consistency under a hybrid fault model. In: 23rd Fault Tolerant Computing Symposium, pp. 402–411. IEEE Computer Society (1993)

  4. Owre, S., Rushby, J., Shankar, N., von Henke, F.: Formal verification for fault-tolerant architectures: prolegomena to the design of PVS. IEEE Trans. Software Eng. 21(2), 107–125 (1995)

  5. Chandra, T.D., Griesemer, R., Redstone, J.: Paxos made live: an engineering perspective. In: ACM Symposium on Principles of Distributed Computing (PODC), pp. 398–407. ACM (2007)

  6. Dutertre, B., Sorea, M.: Modeling and verification of a fault-tolerant real-time startup protocol using calendar automata. In: Lakhnech, Y., Yovine, S. (eds.) FORMATS/FTRTFT 2004. LNCS, vol. 3253, pp. 199–214. Springer, Heidelberg (2004). doi:10.1007/978-3-540-30206-3_15

  7. Boyer, R.S., Moore, J.S.: MJRTY - a fast majority vote algorithm. In: Boyer, R.S. (ed.) Automated Reasoning. Automated Reasoning Series, vol. 1, pp. 105–117. Springer, Dordrecht (1991)

  8. Azadmanesh, M.H., Kieckhafer, R.M.: Exploiting omissive faults in synchronous approximate agreement. IEEE Trans. Comput. 49(10), 1031–1042 (2000)

  9. Pike, L., Maddalon, J., Miner, P., Geser, A.: Abstractions for fault-tolerant distributed system verification. In: Slind, K., Bunker, A., Gopalakrishnan, G. (eds.) TPHOLs 2004. LNCS, vol. 3223, pp. 257–270. Springer, Heidelberg (2004). doi:10.1007/978-3-540-30142-4_19

  10. Rushby, J.: SAL tutorial: analyzing the fault-tolerant algorithm OM(1). Computer Science Laboratory, SRI International, Menlo Park, CA, CSL Technical note. http://www.csl.sri.com/users/rushby/abstracts/om1

  11. Thambidurai, P., Park, Y.-K.: Interactive consistency with multiple failure modes. In: Symposium on Reliable Distributed Systems, pp. 93–100. IEEE (1988)

  12. Rushby, J.: Verification diagrams revisited: disjunctive invariants for easy verification. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 508–520. Springer, Heidelberg (2000). doi:10.1007/10722167_38

  13. Dutertre, B., Sorea, M.: Timed systems in SAL. In: SRI International, Menlo Park, CA, SDL Technical report SRI-SDL-04-03, July 2004

  14. Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)

  15. Bensalem, S., Ganesh, V., Lakhnech, Y., Muñoz, C., Owre, S., Rueß, H., Rushby, J., Rusu, V., Saïdi, H., Shankar, N., Singerman, E., Tiwari, A.: An overview of SAL. In: NASA Langley Formal Methods Workshop, pp. 187–196 (2000)

  16. Rushby, J.: The versatile synchronous observer. In: Iida, S., Meseguer, J., Ogata, K. (eds.) Specification, Algebra, and Software. LNCS, vol. 8373, pp. 110–128. Springer, Heidelberg (2014). doi:10.1007/978-3-642-54624-2_6

  17. Kopetz, H.: Real-Time Systems: Design Principles for Distributed Embedded Applications. Kluwer, Philadelphia (1997)

  18. Jovanović, D., Dutertre, B.: Property-directed \(k\)-induction. In: Formal Methods in Computer Aided Design (FMCAD) (2016)

  19. Bokor, P., Serafini, M., Suri, N.: On efficient models for model checking message-passing distributed protocols. In: Hatcliff, J., Zucca, E. (eds.) FMOODS/FORTE 2010. LNCS, vol. 6117, pp. 216–223. Springer, Heidelberg (2010). doi:10.1007/978-3-642-13464-7_17

Acknowledgments

This work is partially supported by NASA contract #NNL14AA08C. We are indebted to our collaborators Brendan Hall and Srivatsan Varadarajan at Honeywell Labs, and to Wilfredo Torres-Pomales at NASA Langley for their discussions and insights. Additionally, we acknowledge that this work is heavily inspired by a series of papers authored by John Rushby.

Author information

Corresponding author

Correspondence to Benjamin F. Jones.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jones, B.F., Pike, L. (2017). Modular Model-Checking of a Byzantine Fault-Tolerant Protocol. In: Barrett, C., Davies, M., Kahsai, T. (eds) NASA Formal Methods. NFM 2017. Lecture Notes in Computer Science, vol 10227. Springer, Cham. https://doi.org/10.1007/978-3-319-57288-8_12

  • DOI: https://doi.org/10.1007/978-3-319-57288-8_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57287-1

  • Online ISBN: 978-3-319-57288-8

  • eBook Packages: Computer Science, Computer Science (R0)
