Software Fault Tolerance in Safety-Critical Applications

Leveson, Nancy G.

doi:10.1007/978-3-642-45628-2_1

Part of the book series: Informatik-Fachberichte (INFORMATIK, volume 147)

Abstract

Software fault tolerance has primarily been aimed at increasing overall software reliability. Unfortunately, it is impossible to provide general techniques that tolerate all faults with very high confidence, and this paper presents some of the available experimental evidence. In some situations, however, a more limited form of fault tolerance may be all that is needed: the program must be able to prevent unsafe states (but not necessarily all incorrect states), or to detect them and recover to a safe (but not necessarily correct) state. This approach is application-specific; the fault-tolerance facilities are designed for the particular application. This paper briefly describes how this can be accomplished. Although this approach requires more specific analysis of the problem than the more general ones do, it offers the advantage of partial verification of the adequacy of the fault tolerance used (e.g., it is possible to show that certain hazardous states cannot be caused by software faults), and it will therefore aid in certifying and licensing software whose failure could have catastrophic consequences. That is, the approach provides greater confidence about a more limited goal than more general approaches do. These techniques can also be used to tailor more general fault-tolerance techniques, such as recovery blocks, and to aid in writing acceptance tests that will ensure safety. Even with these techniques, it may not be possible to build systems with very low acceptable risk using software components.
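As a rough illustration of the idea of a recovery block whose acceptance test checks safety rather than full correctness, the sketch below tries alternates in order and falls back to a predefined safe output when none passes. It is not taken from the paper: the valve scenario, the 0 to 60 percent safe range, and the names `ValveCommand`, `SAFE_COMMAND`, `is_safe`, `recovery_block`, `primary`, and `secondary` are all hypothetical, chosen only to make the example concrete.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ValveCommand:
    """Hypothetical actuator command (not from the paper)."""
    opening_percent: float


# Assumed safe state for this illustration: valve fully closed.
SAFE_COMMAND = ValveCommand(opening_percent=0.0)


def is_safe(cmd: ValveCommand) -> bool:
    """Safety acceptance test: rejects hazardous outputs only.

    An output that passes may still be incorrect; the assumed hazard
    here is opening the valve beyond 60 percent (or a negative value).
    """
    return 0.0 <= cmd.opening_percent <= 60.0


def recovery_block(
    alternates: Sequence[Callable[[float], ValveCommand]],
    acceptance_test: Callable[[ValveCommand], bool],
    safe_fallback: ValveCommand,
    reading: float,
) -> ValveCommand:
    """Return the first alternate result that passes the acceptance test.

    If every alternate raises or produces an unsafe result, return the
    safe (but not necessarily correct) fallback.
    """
    for alternate in alternates:
        try:
            result = alternate(reading)
        except Exception:
            continue  # treat a crash like a failed acceptance test
        if acceptance_test(result):
            return result
    return safe_fallback


def primary(reading: float) -> ValveCommand:
    # Deliberately faulty for large readings: can exceed the safe range.
    return ValveCommand(opening_percent=reading * 2.0)


def secondary(reading: float) -> ValveCommand:
    # Simpler alternate that caps its output.
    return ValveCommand(opening_percent=min(reading, 50.0))


if __name__ == "__main__":
    for reading in (10.0, 40.0, -5.0):
        cmd = recovery_block([primary, secondary], is_safe, SAFE_COMMAND, reading)
        print(reading, cmd)
```

The acceptance test here is deliberately weaker than a correctness check: it only has to establish that the output cannot put the system into the assumed hazardous state, which is a narrower property and easier to argue about.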

The work reported in this paper was partially supported by Micros grants funded by the University of California, TRW, and Hughes Aircraft Co., by NASA grants NAG-1-511 and NAG-1-668, and by NSF grants DCR-8406532 and DCR-8521398.

Author information

Authors and Affiliations

Dept. of Information and Computer Science, University of California, Irvine, Irvine, CA, 92717, USA
Nancy G. Leveson

Editor information

Editors and Affiliations

Fachbereich 2, Hochschule Bremerhaven, Bürgermeister-Smidt-Straße 20, D-2850, Bremerhaven, Germany
F. Belli
Institut für Rechnerentwurf und Fehlertoleranz Fakultät für Informatik, Universität Karlsruhe, Postfach 6980, D-7500, Karlsruhe 1, Germany
W. Görke

Cite this paper

Leveson, N.G. (1987). Software Fault Tolerance in Safety-Critical Applications. In: Belli, F., Görke, W. (eds) Fehlertolerierende Rechensysteme / Fault-Tolerant Computing Systems. Informatik-Fachberichte, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45628-2_1

DOI: https://doi.org/10.1007/978-3-642-45628-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-18294-8
Online ISBN: 978-3-642-45628-2
eBook Packages: Springer Book Archive
