Fault tolerance has become an increasingly important requirement for computerized systems involved in production. Existing fault tolerance approaches, where they are used at all, deal mainly with hardware faults. Nevertheless, the vast majority of contemporary system failures are software related. This paper introduces a knowledge-based approach to handling software-related faults in supervisory control systems. These systems are event driven: they use data stored in complex databases to react to events coming from different kinds of devices by identifying, scheduling, initiating and monitoring operations. An application fault is the failure of part of the supervisory control system's software to behave rationally when unexpected events occur. The approach is based on a supervisory control system reference model that reveals the set of all possible application faults, together with the major functions of the recovery process associated with each fault. This model leads to a high-level knowledge-based system architecture capable of handling every fault-related condition. The resulting system is called PROFIT (Intelligent PROduction systems Fault Tolerance) and consists of three main components: the fault diagnosis module, the instant fault correction module and the learning module, co-ordinated by a PROFIT meta-level module. The prototype version of PROFIT is analysed, and its development and run-time environments are presented to demonstrate the applicability and effectiveness of the system.
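The division of labour among the three PROFIT modules and the meta-level module can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract gives no interfaces, so every class name, method name, event field and fault name below is hypothetical, and the dictionary lookups merely stand in for the knowledge base. The sketch assumes only what the abstract states, namely that a meta-level module co-ordinates diagnosis, instant correction and learning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Event:
    """An unexpected event arriving from a device (hypothetical structure)."""
    device: str
    kind: str


class DiagnosisModule:
    """Maps an unexpected event to a named application fault."""

    def __init__(self, known_faults: Dict[str, str]):
        # event kind -> fault name; a stand-in for the knowledge base
        self.known_faults = known_faults

    def diagnose(self, event: Event) -> Optional[str]:
        return self.known_faults.get(event.kind)


class CorrectionModule:
    """Runs the recovery action associated with a diagnosed fault."""

    def __init__(self, recoveries: Dict[str, Callable[[Event], str]]):
        self.recoveries = recoveries

    def correct(self, fault: str, event: Event) -> Optional[str]:
        action = self.recoveries.get(fault)
        return action(event) if action else None


class LearningModule:
    """Records events that could not be handled, for later knowledge updates."""

    def __init__(self) -> None:
        self.unhandled: List[Event] = []

    def record(self, event: Event) -> None:
        self.unhandled.append(event)


class MetaLevel:
    """Co-ordinates the three modules for each incoming event."""

    def __init__(self, diagnosis: DiagnosisModule,
                 correction: CorrectionModule,
                 learning: LearningModule) -> None:
        self.diagnosis = diagnosis
        self.correction = correction
        self.learning = learning

    def handle(self, event: Event) -> str:
        fault = self.diagnosis.diagnose(event)
        if fault is None:
            # Undiagnosed event: hand it to the learning module.
            self.learning.record(event)
            return "unknown fault: logged for learning"
        result = self.correction.correct(fault, event)
        if result is None:
            # Diagnosed, but no recovery is known yet.
            self.learning.record(event)
            return f"{fault}: no recovery known, logged for learning"
        return f"{fault}: {result}"
```

In this sketch the meta-level module decides which module acts next, so a new fault type would be supported by extending the two lookup tables and leaving the control flow unchanged.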
Cite this article
Askounis, D.T., Assimakopoulos, V. & Psarras, J. Fault tolerance in supervisory control systems: a knowledge-based approach. J Intell Manuf 5, 323–331 (1994). https://doi.org/10.1007/BF00127650