Abstract
With the spread of network applications and multiprocessor systems, system dependability has drawn considerable attention. This paper presents a new node-grouping technique for system-level fault diagnosis that reduces the complexity of diagnosing large systems. The technique transforms a complicated system into a group network, where each group may consist of many nodes that are either fault-free or faulty. The transformation is proven to yield a unique group network, which eases system diagnosis. The paper then systematically studies the one-step t-fault diagnosis problem under node grouping, using the concept of independent point sets, and gives a simple necessary and sufficient condition. It also presents a diagnosis procedure for t-diagnosable systems. Furthermore, an efficient probabilistic diagnosis algorithm for practical applications is proposed, based on the belief that most of the nodes in a system are fault-free. Software simulation results show that the probabilistic diagnosis achieves a high probability of correct diagnosis at low diagnosis cost and is suitable for systems of any topology.
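The abstract does not spell out the grouping rule, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a PMC-style test model in which a fault-free tester always reports correctly, so two nodes that test each other as fault-free must share the same status and can be merged into one group. It then applies the "most nodes are fault-free" belief by presuming the largest group fault-free. The function names `group_nodes` and `presumed_fault_free` and the `tests` dictionary layout are hypothetical.

```python
from collections import defaultdict


def group_nodes(n, tests):
    """Merge nodes linked by mutual 'pass' test outcomes (assumed rule).

    n     -- number of nodes, labelled 0..n-1
    tests -- dict mapping (tester, tested) -> 0 (pass) or 1 (fail)
    Returns a sorted list of groups, each a sorted list of node ids.
    """
    parent = list(range(n))  # union-find forest, one tree per group

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), result in tests.items():
        # Assumption: u and v share a status only if each passes the other.
        if result == 0 and tests.get((v, u)) == 0:
            parent[find(u)] = find(v)

    groups = defaultdict(list)
    for node in range(n):
        groups[find(node)].append(node)
    return sorted(sorted(g) for g in groups.values())


def presumed_fault_free(n, tests):
    """Heuristic: treat the largest group as the fault-free one."""
    return max(group_nodes(n, tests), key=len)


# Hypothetical syndrome: nodes 0-2 are fault-free, nodes 3 and 4 are faulty.
tests = {
    (0, 1): 0, (1, 0): 0,
    (1, 2): 0, (2, 1): 0,
    (2, 3): 1, (3, 2): 0,
    (3, 4): 1, (4, 3): 0,
    (4, 0): 1, (0, 4): 1,
}
groups = group_nodes(5, tests)
good = presumed_fault_free(5, tests)
```

In this example the mutual-pass pairs (0, 1) and (1, 2) collapse into a single group, while nodes 3 and 4 remain singletons, so the group network has three vertices instead of five. The largest-group heuristic is only as reliable as the assumption that fault-free nodes form the majority.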
Additional information
This work was supported by the National Natural Science Foundation of China under the grants No.69973016 and No.69733010.
ZHANG Dafang, born in 1959, is a professor, Ph.D. supervisor, and director of the Department of Computer Science, Hunan University. Prof. Zhang is engaged in research and teaching in test and diagnosis, fault-tolerant computing, and network and communication technology. He currently leads three projects supported by the National Natural Science Foundation of China and one project supported by the National 863 Programme. He has published 11 books and edited 4 books. Prof. Zhang is a member of the Technical Committee on Fault-Tolerant Computing of the China Computer Federation, deputy director of a technical group on testing and diagnosis, and a member of IEEE. He also serves as vice-president of the Research Committee of the China Computer Continuing Education.
XIE Gaogang, born in 1974, received the M.S. degree in computer science from Hunan University. He is currently pursuing his Ph.D. degree in computer science at Hunan University and the Institute of Computing Technology, The Chinese Academy of Sciences. His research interests include test and diagnosis, network and communication technology, and distributed computing. He has published more than 10 papers in journals and conference proceedings.
Cite this article
Zhang, D., Xie, G. & Min, Y. Node grouping in system-level fault diagnosis. J. Comput. Sci. & Technol. 16, 474–479 (2001). https://doi.org/10.1007/BF02948966
DOI: https://doi.org/10.1007/BF02948966