Probabilistic fault localization with sliding windows

  • Research Paper
Science China Information Sciences

Abstract

Fault localization is a central element of network fault management. This paper adopts a weighted bipartite graph as the fault propagation model and presents a heuristic fault localization algorithm based on the idea of incremental coverage, which is resilient to an inaccurate fault propagation model and to noisy environments. Furthermore, a sliding window mechanism is proposed to address the inaccuracy of this algorithm when time windows are improperly chosen. In our simulation study, the scheme achieves a higher detection rate and a lower false positive rate than current fault localization algorithms, both in noisy environments and in the presence of inaccurate windows.
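The abstract does not give the algorithm itself, so the following is only a minimal sketch of what coverage-driven localization on a weighted bipartite graph could look like. It assumes a greedy rule: repeatedly pick the fault whose edges add the most weight over symptoms that are still unexplained. The names `EXAMPLE_MODEL`, `localize`, and `min_gain`, and all edge weights, are hypothetical and chosen for illustration; the paper's actual heuristic and its sliding window mechanism may differ.

```python
from typing import Dict, List, Set

# Hypothetical fault propagation model as a weighted bipartite graph:
# fault -> {symptom: assumed probability that the fault causes the symptom}.
Model = Dict[str, Dict[str, float]]

EXAMPLE_MODEL: Model = {
    "f1": {"s1": 0.9, "s2": 0.8},
    "f2": {"s2": 0.6, "s3": 0.7},
    "f3": {"s3": 0.2},
}


def localize(model: Model, observed: Set[str], min_gain: float = 0.0) -> List[str]:
    """Greedy coverage heuristic (illustrative, not the paper's algorithm).

    Each round selects the fault with the largest total edge weight over
    symptoms that no previously selected fault explains.
    """
    uncovered = set(observed)
    hypothesis: List[str] = []
    while uncovered:
        best_fault = None
        best_gain = min_gain
        # Sorted iteration keeps tie-breaking deterministic.
        for fault in sorted(model):
            if fault in hypothesis:
                continue
            gain = sum(w for s, w in model[fault].items() if s in uncovered)
            if gain > best_gain:
                best_fault, best_gain = fault, gain
        if best_fault is None:
            # No remaining fault explains the leftover symptoms; under this
            # sketch they are treated as possibly spurious alarms.
            break
        hypothesis.append(best_fault)
        uncovered -= set(model[best_fault])
    return hypothesis


if __name__ == "__main__":
    print(localize(EXAMPLE_MODEL, {"s1", "s2", "s3"}))
```

Raising `min_gain` above zero is one simple way to make such a sketch ignore weakly supported faults when some observed symptoms are noise; whether the paper handles noise this way is not stated in the abstract.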

Author information

Corresponding author

Correspondence to JianXin Liao.

About this article

Cite this article

Zhang, C., Liao, J., Li, T. et al. Probabilistic fault localization with sliding windows. Sci. China Inf. Sci. 55, 1186–1200 (2012). https://doi.org/10.1007/s11432-012-4567-x

  • DOI: https://doi.org/10.1007/s11432-012-4567-x
