An Isochronous Testing Strategy for Hierarchical Adaptive Distributed System-Level Diagnosis

Journal of Electronic Testing

Abstract

Distributed system-level diagnosis allows the fault-free components of a fault-tolerant distributed system to determine which components are faulty and which are fault-free. The time it takes for nodes running the algorithm to diagnose a new event is called the algorithm's latency. In this paper we present a new distributed system-level diagnosis algorithm with a latency of O(log N) testing rounds for a system of N nodes. A previous hierarchical distributed system-level diagnosis algorithm, Hi-ADSD, has a latency of O(log² N) testing rounds. Nodes are grouped in progressively larger logical clusters for the purpose of testing. The algorithm employs an isochronous testing strategy that forces all fault-free nodes to test clusters of the same size in each testing round. This strategy is based on two main principles: a tested node must test its tester in the same round, and a node only accepts tests according to a lexical priority order. We present formal proofs that the algorithm's latency is at most 2 log N − 1 testing rounds and that the testing strategy leads to the execution of isochronous tests. Simulation results are shown for systems of up to 64 nodes.
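To give a feel for the gap between the two latency figures quoted in the abstract, the following minimal sketch tabulates them for a few system sizes. It rests on assumptions that the abstract does not state: logarithms are taken base 2, N is a power of two, and the Hi-ADSD growth rate O(log² N) is evaluated with a constant factor of 1 purely for illustration. The function names are hypothetical and are not taken from the paper.

```python
def log2_exact(n: int) -> int:
    """Return log2(n), assuming n is a power of two (illustrative restriction)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    return n.bit_length() - 1


def isochronous_bound(n: int) -> int:
    """Upper bound from the abstract: 2 log N - 1 testing rounds (base 2 assumed)."""
    return 2 * log2_exact(n) - 1


def hi_adsd_growth(n: int) -> int:
    """log^2 N rounds: the O(log^2 N) rate with constant 1 (an assumption)."""
    return log2_exact(n) ** 2


def latency_table(sizes=(8, 16, 32, 64)):
    """Return (N, isochronous bound, Hi-ADSD growth value) for each size."""
    return [(n, isochronous_bound(n), hi_adsd_growth(n)) for n in sizes]


if __name__ == "__main__":
    for n, new_rounds, old_rounds in latency_table():
        print(f"N={n:3d}  2logN-1={new_rounds:2d}  log^2 N={old_rounds:2d}")
```

Under these assumptions, the largest simulated size mentioned in the abstract, N = 64, gives a bound of 11 rounds against a log² N value of 36.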

References

  1. R.P. Bianchini and R. Buskens, “Implementation of On-Line Distributed System-Level Diagnosis Theory,” IEEE Transactions on Computers, vol. 41, pp. 616-626, 1992.

  2. E.P. Duarte Jr. and T. Nanya, “Multi-Cluster Adaptive Distributed System-Level Diagnosis Algorithms,” IEICE Technical Report FTS 95-73, 1995.

  3. E.P. Duarte Jr. and T. Nanya, “A Hierarchical Adaptive Distributed System-Level Diagnosis Algorithm,” IEEE Transactions on Computers, vol. 47, no. 1, pp. 34-45, Jan. 1998.

  4. S.L. Hakimi and A.T. Amin, “Characterization of Connection Assignments of Diagnosable Systems,” IEEE Transactions on Computers, vol. 23, pp. 86-88, 1974.

  5. S.L. Hakimi and K. Nakajima, “On Adaptive System Diagnosis,” IEEE Transactions on Computers, vol. 33, pp. 234-240, 1984.

  6. S.H. Hosseini, J.G. Kuhl, and S.M. Reddy, “A Diagnosis Algorithm for Distributed Computing Systems with Failure and Repair,” IEEE Transactions on Computers, vol. 33, pp. 223-233, 1984.

  7. P. Jalote, Fault Tolerance in Distributed Systems, Englewood Cliffs, N.J.: Prentice Hall, 1994.

  8. M.H. MacDougall, Simulating Computer Systems: Techniques and Tools, Cambridge, MA: The MIT Press, 1987.

  9. G. Masson, D. Blough, and G. Sullivan, “System Diagnosis,” in Fault-Tolerant Computer System Design, D.K. Pradhan (ed.), Prentice-Hall, 1996.

  10. F. Preparata, G. Metze, and R.T. Chien, “On The Connection Assignment Problem of Diagnosable Systems,” IEEE Transactions on Electronic Computers, vol. 16, pp. 848-854, 1968.

Cite this article

Brawerman, A., Duarte, E.P. An Isochronous Testing Strategy for Hierarchical Adaptive Distributed System-Level Diagnosis. Journal of Electronic Testing 17, 185–195 (2001). https://doi.org/10.1023/A:1011182029135

  • DOI: https://doi.org/10.1023/A:1011182029135
