
Backward Inference in Bayesian Networks for Distributed Systems Management

Journal of Network and Systems Management

The growing complexity of distributed systems, in terms of hardware components, operating systems, communication and application software, and the large number of dependencies among them, has increased the demand for distributed management systems. An efficient distributed management system needs to work effectively even in the face of incomplete management information, uncertain situations, and dynamic changes. In this paper, Bayesian networks are proposed to model dependencies between managed objects in distributed systems. The strongest dependency route (SDR) algorithm is developed for backward inference in Bayesian networks. The SDR algorithm can track the strongest causes and trace the strongest routes between particular effects and their causes; the strongest dependency of the causes can also be obtained by the algorithm. Backward inference thus provides an efficient mechanism for fault location and is beneficial for performance management.
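
To make the idea of a strongest route between an effect and its causes concrete, here is a minimal sketch. It is not the paper's SDR algorithm, whose details are not given in this abstract. It assumes that each dependency edge carries a conditional probability P(effect | cause), that the strength of a route is the product of the probabilities along it, and that the strongest routes can be found with a Dijkstra-style search run backward from the observed effect. The network, the node names, the probabilities, and the function names are all hypothetical.

```python
import heapq

# Hypothetical dependency network: effect -> {direct cause: P(effect | cause)}.
# Names and probabilities are illustrative only, not taken from the paper.
PARENTS = {
    "service_down": {"server_fault": 0.9, "link_fault": 0.6},
    "server_fault": {"disk_fault": 0.5, "power_fault": 0.8},
    "link_fault": {"router_fault": 0.7},
    "disk_fault": {},
    "power_fault": {},
    "router_fault": {},
}


def strongest_routes(parents, effect):
    """Return {node: (strength, route)} for `effect` and all of its ancestors.

    Assumption: route strength is the product of edge probabilities. Because
    every probability lies in (0, 1], extending a route never makes it
    stronger, so expanding the strongest unfinished node first is safe.
    """
    best = {effect: (1.0, [effect])}
    done = set()
    heap = [(-1.0, effect)]  # strengths are negated because heapq is a min-heap
    while heap:
        _, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry for an already finished node
        done.add(node)
        strength, route = best[node]
        for cause, prob in parents.get(node, {}).items():
            candidate = strength * prob
            if cause not in best or candidate > best[cause][0]:
                best[cause] = (candidate, route + [cause])
                heapq.heappush(heap, (-candidate, cause))
    return best


def strongest_root_cause(parents, effect):
    """Return the parentless ancestor with the strongest route, or None."""
    routes = strongest_routes(parents, effect)
    roots = [n for n in routes if n != effect and not parents.get(n)]
    return max(roots, key=lambda n: routes[n][0], default=None)
```

On this toy network, `strongest_root_cause(PARENTS, "service_down")` selects `power_fault`, reached through `server_fault`. Treating the product of edge probabilities as route strength is a simplification chosen for illustration; exact posterior inference in a Bayesian network would also account for prior probabilities and for multiple interacting causes.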



ACKNOWLEDGMENTS

This research is part of the International Quality Networks (IQN) project and is supported by the DAAD (the German Academic Exchange Service). The authors would like to thank Carsten Schippang for providing a full year of sample data from the campus network of FernUniversität Hagen. Many thanks are also due to the anonymous reviewers for their valuable comments.

Author information

Corresponding author

Correspondence to Jianguo Ding.

Additional information

Jianguo Ding received his M.Sc. in computer science from Hefei University of Technology, P. R. China, in 1999. He obtained a joint Ph.D. in computer science from Shanghai Jiao Tong University in P. R. China and FernUniversität Hagen in Germany in 2005. He was supported by a DAAD (German Academic Exchange Service) scholarship. He is a member of the IEEE. His current research interests include distributed systems management, intelligent technology, and probabilistic reasoning.

Bernd Krämer is a full professor at FernUniversität in Hagen, Germany. He obtained his diploma and doctorate in computer science from the Technical University of Berlin. He is the president of the international Society for Process and Design Sciences. His research interests include distributed software engineering, e-learning technology, distributed systems management, and dependable software.

Yingcai Bai graduated from Tsinghua University, P. R. China. He is a full professor at Shanghai Jiao Tong University, P. R. China. He is also the president of Shanghai Engineering Center of GOLDEN Network and the president of Shanghai Computer Open System Association. His research interests include network architecture, network security, and distributed systems management.

Hansheng Chen graduated in mathematics from Fudan University, P. R. China. He is a professor at East-China Institute of Computer Technology. He is a visiting professor at FernUniversität in Hagen, Germany. His research focuses on software engineering and distributed systems.

About this article

Cite this article

Ding, J., Krämer, B., Bai, Y. et al. Backward Inference in Bayesian Networks for Distributed Systems Management. J Netw Syst Manage 13, 409–427 (2005). https://doi.org/10.1007/s10922-005-9003-8

  • DOI: https://doi.org/10.1007/s10922-005-9003-8
