A distributed formal-based model for self-healing behaviors in autonomous systems: from failure detection to self-recovery

The Journal of Supercomputing

Abstract

Current software-intensive systems and large-scale information and computing environments are highly dynamic, heterogeneous, and unpredictable, which has motivated the development of techniques that equip these systems with autonomous behaviors. Although many aspects of such systems have been studied in depth, designing them remains considerably more challenging than designing traditional systems. Self-healing is one of the main features that characterize autonomic computing systems. Failure detection, recovery strategies, and reliability are of paramount importance to ensure continuous operation and correct functioning even when up to a given maximum number of components are faulty. Most existing research and implementations introduce self-healing behaviors through architecture-specific solutions. Users must therefore tailor their software to architecture-specific fault-tolerance features, which demands considerable effort from developers and users. This paper proposes a distributed formal model for the specification, verification, and analysis of self-healing behaviors in autonomous systems, from failure detection to self-recovery. Such a high-level model allows users to specify and apply the desired type of failure detection and recovery without requiring any knowledge of its implementation. The model supports both formal verification of different properties and performance evaluation. Qualitative properties are verified using state-space exploration tools, and quantitative properties are validated through statistical model checking. All these properties are preserved in the actual implementation by ensuring that the deployed code is consistent with the validated model.
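
To make the idea of validating a quantitative property by statistical model checking more concrete, the following is a minimal Python sketch and not the authors' model or tooling. It assumes a hypothetical toy component that fails with probability `p_fail` per step and whose failure is detected (and recovered) with probability `p_detect` per step afterwards; all function and parameter names are illustrative. The number of simulation runs is taken from the standard Chernoff-Hoeffding bound, so that the estimate lies within `epsilon` of the true probability with confidence at least `1 - delta`.

```python
import math
import random


def required_samples(epsilon: float, delta: float) -> int:
    """Chernoff-Hoeffding bound on the number of independent runs needed so
    that the estimate is within epsilon of the true probability with
    confidence at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def simulate_run(rng: random.Random, p_fail: float, p_detect: float,
                 max_steps: int, deadline: int) -> bool:
    """One random run of the toy component (an assumption of this sketch).

    Returns True if no failure occurs within max_steps, or if the first
    failure is detected and recovered within `deadline` steps."""
    for _ in range(max_steps):
        if rng.random() < p_fail:
            # The component has failed: the detector gets `deadline` attempts.
            for _ in range(deadline):
                if rng.random() < p_detect:
                    return True   # recovered in time
            return False          # deadline missed
    return True                   # no failure during this run


def estimate_probability(epsilon: float, delta: float, seed: int = 0,
                         **params) -> tuple[float, int]:
    """Monte Carlo estimate of P(recovery within deadline) and the run count."""
    rng = random.Random(seed)
    n = required_samples(epsilon, delta)
    successes = sum(simulate_run(rng, **params) for _ in range(n))
    return successes / n, n


if __name__ == "__main__":
    estimate, runs = estimate_probability(
        epsilon=0.05, delta=0.05,
        p_fail=0.01, p_detect=0.3, max_steps=100, deadline=5)
    print(f"estimated probability {estimate:.3f} from {runs} runs")
```

Tightening `epsilon` or `delta` increases the number of runs, which is the usual trade-off between precision and simulation cost in this style of analysis.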

Author information

Corresponding author

Correspondence to Imene Ben Hafaiedh.

About this article

Cite this article

Hafaiedh, I.B., Slimane, M.B. A distributed formal-based model for self-healing behaviors in autonomous systems: from failure detection to self-recovery. J Supercomput 78, 18725–18753 (2022). https://doi.org/10.1007/s11227-022-04614-0

  • DOI: https://doi.org/10.1007/s11227-022-04614-0
