Fault-Adaptivity in Hard Real-Time Component-Based Software Systems

  • Chapter

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7475)

Abstract

Complexity in embedded software systems has reached the point where we need run-time mechanisms that provide fault management services. Because testing and verification may not cover every scenario that a system encounters, a simpler yet formally specified architecture for run-time monitoring, diagnosis, and fault mitigation is needed to increase the software system’s dependability. The approach described in this paper borrows concepts and principles from the field of ‘Systems Health Management’ for complex aerospace systems and implements a novel two-level health management architecture that can be applied in the context of a model-based software development process.

At the first level, the Component-level Health Manager (CLHM) provides localized, limited services for managing the health of individual software components. At the second level, the System-level Health Manager (SLHM) manages the health of the overall system. The SLHM includes a diagnosis engine that uses a Timed Failure Propagation Graph (TFPG) model, which is automatically synthesized from the system specification built in the model-based design environment that accompanies the runtime system. The SLHM also includes a reactive timed state machine used for mitigation, whose code is likewise generated from the model-based specification. This paper uses simple examples to illustrate the approach.
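The two-level structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name in it (Alarm, MODEL, diagnose, ComponentHealthManager, SystemHealthManager, the failure modes, and the mitigation commands) is hypothetical, and the failure propagation model is reduced to direct, hand-written links from a failure mode to its discrepancies, each with a delay window. In the chapter, the TFPG model and the mitigation state machine are generated from the model-based specification. The sketch only shows the general idea: a component-level manager reacts locally and forwards alarms, while a system-level manager looks for failure modes that are consistent with the alarm timing and then drives a small timed state machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    discrepancy: str  # name of the observed anomaly
    time: float       # observation time in seconds


# Hypothetical model: failure mode -> {discrepancy: (t_min, t_max)}.
# Each interval bounds the delay between fault onset and the discrepancy.
MODEL = {
    "sensor_fault": {"stale_data": (0.0, 1.0), "bad_estimate": (1.0, 3.0)},
    "network_fault": {"stale_data": (0.0, 0.5), "missed_deadline": (0.5, 2.0)},
}


def diagnose(model, alarms):
    """Return the failure modes that explain every alarm within its window."""
    if not alarms:
        return []
    hypotheses = []
    for mode, effects in model.items():
        if any(a.discrepancy not in effects for a in alarms):
            continue
        # Each alarm implies an interval of possible fault-onset times;
        # the hypothesis is consistent only if all intervals intersect.
        earliest = max(a.time - effects[a.discrepancy][1] for a in alarms)
        latest = min(a.time - effects[a.discrepancy][0] for a in alarms)
        if earliest <= latest:
            hypotheses.append(mode)
    return hypotheses


class SystemHealthManager:
    """System level (assumed design): diagnosis plus a timed state machine."""

    MITIGATION = {  # hypothetical mitigation commands
        "sensor_fault": "switch_to_backup_sensor",
        "network_fault": "reset_network_component",
    }

    def __init__(self, model, timeout):
        self.model = model
        self.timeout = timeout  # seconds allowed for a mitigation to finish
        self.alarms = []
        self.state = "NOMINAL"
        self.entered = 0.0

    def observe(self, alarm):
        self.alarms.append(alarm)

    def step(self, now):
        """Advance the state machine; return a command or None."""
        if self.state == "NOMINAL":
            hypotheses = diagnose(self.model, self.alarms)
            if len(hypotheses) == 1:  # act only on an unambiguous diagnosis
                self.state, self.entered = "MITIGATING", now
                return self.MITIGATION[hypotheses[0]]
        elif self.state == "MITIGATING" and now - self.entered > self.timeout:
            self.state = "SAFE_MODE"
            return "enter_safe_mode"
        return None


class ComponentHealthManager:
    """Component level (assumed design): local reaction, then escalation."""

    def __init__(self, name, local_actions, system_manager):
        self.name = name
        self.local_actions = local_actions
        self.system_manager = system_manager

    def detect(self, discrepancy, time):
        # Pick a local, limited response and always inform the system level.
        action = self.local_actions.get(discrepancy, "ignore")
        self.system_manager.observe(Alarm(discrepancy, time))
        return action
```

In this sketch a single "stale_data" alarm is ambiguous, because both hypothetical failure modes can produce it, so the system-level manager waits. A later "bad_estimate" alarm that falls inside the expected delay window leaves only "sensor_fault" as a consistent explanation and triggers its mitigation command. The timeout models the timed aspect of the mitigation logic: if the state machine remains in MITIGATING for longer than the allowed time, it falls back to a safe mode.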

This paper is based upon work supported by NASA under award NNX08AY49A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration. The authors would like to thank Dr Paul Miner, Eric Cooper, and Suzette Person of NASA LaRC for their help and guidance on the project.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Dubey, A., Karsai, G., Mahadevan, N. (2013). Fault-Adaptivity in Hard Real-Time Component-Based Software Systems. In: de Lemos, R., Giese, H., Müller, H.A., Shaw, M. (eds) Software Engineering for Self-Adaptive Systems II. Lecture Notes in Computer Science, vol 7475. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35813-5_12

  • DOI: https://doi.org/10.1007/978-3-642-35813-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35812-8

  • Online ISBN: 978-3-642-35813-5

  • eBook Packages: Computer Science, Computer Science (R0)
