Abstract
The use of fault-tolerant design to increase the reliability of computer systems is widely accepted. For highly critical applications, the level of confidence that may be assigned to predictions of the reliability of such systems is limited by many factors, including uncertainty in assumed fault types, limitations in the verifiability of hardware and software designs, inadequate models for the causes and effects of design errors, imperfect testability of physical implementations, and inadequate consideration of the effects of human error. New techniques for specification, design, and analysis are being developed, but research and development must be accelerated to keep up with the rapid pace of increasing device and system complexity. Better integration of life-cycle considerations is needed to adapt systems to actual fault conditions and to achieve reliable and efficient interaction with operators and maintenance personnel.
© 1984 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Goldberg, J. (1984). The Problem of Confidence in Fault-Tolerant Computer Design. In: Wettstein, H. (eds) Architektur und Betrieb von Rechensystemen. Informatik-Fachberichte, vol 78. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-69394-6_25
DOI: https://doi.org/10.1007/978-3-642-69394-6_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-12913-4
Online ISBN: 978-3-642-69394-6
eBook Packages: Springer Book Archive