Symbolic synthesis of masking fault-tolerant distributed programs

Distributed Computing

Abstract

We focus on the automated addition of masking fault-tolerance to existing fault-intolerant distributed programs. Intuitively, a program is masking fault-tolerant if it satisfies its safety and liveness specifications both in the absence and in the presence of faults. Masking fault-tolerance is highly desirable in distributed programs, as the structure of such programs is fairly complex and they are often subject to various types of faults. However, the problem of synthesizing masking fault-tolerant distributed programs from their fault-intolerant versions is NP-complete in the size of the program’s state space, which puts the practicality of the synthesis problem in doubt. In this paper, we show that, in spite of the high worst-case complexity, synthesizing moderate-sized masking distributed programs is feasible in practice. In particular, we present and implement a BDD-based synthesis heuristic for automatically adding masking fault-tolerance to existing fault-intolerant distributed programs. Our experiments validate the efficiency and effectiveness of our algorithm in the sense that synthesis is possible in a reasonable amount of time and memory. We also identify several bottlenecks in the synthesis of distributed programs that depend on the structure of the program at hand. We conclude that, unlike in verification, the most challenging barrier in program synthesis is not the state explosion problem by itself, but the time complexity of the decision procedures.
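The intuitive definition of masking fault-tolerance above can be illustrated with a small fixpoint computation. The following Python sketch is not the synthesis heuristic of the paper: it only checks the masking property for an already given program, and it uses explicit Python sets as a stand-in for BDDs. All names (`fault_span`, `recovers`, `is_masking`), the encoding of states as integers, and the toy transition relations are assumptions made for illustration. It further assumes that safety is given as a set of bad states and that faults eventually stop, so liveness reduces to program-only recovery to the invariant.

```python
from typing import Set, Tuple

State = int
Transition = Tuple[State, State]


def image(states: Set[State], transitions: Set[Transition]) -> Set[State]:
    """Successors of `states` under `transitions` (one forward step)."""
    return {t for (s, t) in transitions if s in states}


def fault_span(invariant: Set[State],
               program: Set[Transition],
               faults: Set[Transition]) -> Set[State]:
    """Least fixpoint: states reachable from the invariant via program or fault steps."""
    combined = program | faults
    reached = set(invariant)
    frontier = set(invariant)
    while frontier:
        frontier = image(frontier, combined) - reached
        reached |= frontier
    return reached


def recovers(span: Set[State],
             invariant: Set[State],
             program: Set[Transition]) -> bool:
    """True if every program-only computation from the span reaches the invariant."""
    good = set(invariant)
    changed = True
    while changed:
        changed = False
        for s in span - good:
            successors = image({s}, program)
            # A state recovers if it is not deadlocked and all of its
            # program successors are already known to recover.
            if successors and successors <= good:
                good.add(s)
                changed = True
    return span <= good


def is_masking(invariant: Set[State],
               program: Set[Transition],
               faults: Set[Transition],
               bad: Set[State]) -> bool:
    """Safety holds on the whole fault-span and recovery to the invariant is guaranteed."""
    span = fault_span(invariant, program, faults)
    return not (span & bad) and recovers(span, invariant, program)


# Hypothetical toy system: states 0 and 1 are legitimate, state 2 is a
# perturbed state reachable only through a fault, state 3 violates safety.
invariant = {0, 1}
faults = {(1, 2)}
bad = {3}
intolerant_program = {(0, 1), (1, 0)}           # deadlocks in state 2
tolerant_program = intolerant_program | {(2, 0)}  # adds a recovery step

print(is_masking(invariant, intolerant_program, faults, bad))
print(is_masking(invariant, tolerant_program, faults, bad))
```

In a symbolic setting, the sets and the `image` operation would be represented by BDDs and relational products, so that the same fixpoints are computed without enumerating states one by one.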

Author information

Corresponding author

Correspondence to Borzoo Bonakdarpour.

Additional information

This work is partially sponsored by the Canada NSERC DG 357121-2008, ORF RE03-045, ORE RE-04-036, and ISOP IS09-06-037 grants, and by the USA AFOSR FA9550-10-1-0178 and NSF CNS 0914913 grants.

About this article

Cite this article

Bonakdarpour, B., Kulkarni, S.S. & Abujarad, F. Symbolic synthesis of masking fault-tolerant distributed programs. Distrib. Comput. 25, 83–108 (2012). https://doi.org/10.1007/s00446-011-0139-3

  • DOI: https://doi.org/10.1007/s00446-011-0139-3
