Abstract
We consider the problem of fault diagnosis in multiprocessor systems. Every processor can test its neighbors; fault-free processors correctly identify the fault status of tested neighbors, while faulty testers can give arbitrary test results. Processors fail independently with constant probability p < 1/2, and the goal is to correctly identify the status of all processors based on the set of test results. We give fast diagnosis algorithms with the highest possible probability of correctness for systems represented by complete bipartite graphs and by simple paths. This is the first time that the most reliable fault diagnosis has been given for these systems in a probabilistic model without any assumptions on the behavior of faulty processors.
Research partly supported by a grant from KBN. This work was done during the author's stay at the Université du Québec à Hull, supported by NSERC International Fellowship.
Research supported in part by NSERC grant OGP 0008136.
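To make the testing model in the abstract concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a simple maximum-likelihood rule: among all fault sets consistent with the observed test results, pick the one with the largest prior probability, which for p < 1/2 is a consistent set with the fewest faulty processors. The paper's notion of highest probability of correctness under arbitrary faulty behavior may differ from this rule. The function names (`consistent`, `most_likely_fault_set`), the encoding of results, and the four-processor path example are all hypothetical, and the search is brute force (exponential in the number of processors), so it does not reflect the fast algorithms the abstract refers to.

```python
from itertools import product

def consistent(faulty, results):
    """Check a candidate fault set against the test results.

    results maps (tester, tested) -> 1 if the tester reported "faulty",
    0 if it reported "fault-free". Only fault-free testers are required
    to tell the truth; faulty testers may report anything.
    """
    for (tester, tested), outcome in results.items():
        if tester in faulty:
            continue  # arbitrary result, imposes no constraint
        if outcome != (1 if tested in faulty else 0):
            return False
    return True

def most_likely_fault_set(n, results, p):
    """Brute-force search over all 2^n fault sets (illustration only)."""
    best, best_prob = None, -1.0
    for bits in product([0, 1], repeat=n):
        faulty = frozenset(i for i, b in enumerate(bits) if b)
        if not consistent(faulty, results):
            continue
        # Independent failures with probability p each.
        prob = p ** len(faulty) * (1 - p) ** (n - len(faulty))
        if prob > best_prob:
            best, best_prob = faulty, prob
    return best, best_prob

# Hypothetical example: path 0 - 1 - 2 - 3, processor 2 is faulty.
# Neighbors test each other; the results of tester 2 are arbitrary.
results = {
    (0, 1): 0, (1, 0): 0,   # fault-free pair agrees
    (1, 2): 1, (3, 2): 1,   # fault-free neighbors detect processor 2
    (2, 1): 1, (2, 3): 0,   # faulty tester, arbitrary outcomes
}
diagnosis, prob = most_likely_fault_set(4, results, p=0.1)
print(sorted(diagnosis), prob)
```

In this example the only consistent fault set with a single faulty processor is {2}, so the rule selects it; every larger consistent set has a strictly smaller prior probability because p < 1/2.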
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Diks, K., Pelc, A. (1997). System diagnosis with smallest risk of error. In: d'Amore, F., Franciosa, P.G., Marchetti-Spaccamela, A. (eds) Graph-Theoretic Concepts in Computer Science. WG 1996. Lecture Notes in Computer Science, vol 1197. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62559-3_13
DOI: https://doi.org/10.1007/3-540-62559-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62559-9
Online ISBN: 978-3-540-68072-7
eBook Packages: Springer Book Archive