The growing complexity of distributed systems in terms of hardware components, operating systems, communication and application software, together with the large number of dependencies among them, has increased the demand for distributed management systems. An efficient distributed management system needs to work effectively even in the face of incomplete management information, uncertain situations, and dynamic changes. In this paper, Bayesian networks are proposed to model dependencies between managed objects in distributed systems. The strongest dependency route (SDR) algorithm is developed for backward inference in Bayesian networks. The SDR algorithm can track the strongest causes and trace the strongest routes between particular effects and their causes; it also yields the strongest dependency of each cause. Backward inference thus provides an efficient mechanism for fault localization and is beneficial for performance management.
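The idea of tracing the strongest route from an observed effect back to its candidate causes can be illustrated with a small sketch. This is not the SDR algorithm as defined in the paper, whose details are not given in the abstract. It is a generic maximum-product path search, under the assumption that each dependency carries a strength in (0, 1] and that the strength of a route is the product of the strengths along it. The function name `strongest_routes`, the node names, and all numeric strengths are hypothetical.

```python
import heapq


def strongest_routes(parents, effect):
    """Search backward from `effect` through its ancestors.

    parents: dict mapping a node to {parent_node: strength}, with each
             strength in (0, 1] (an assumed measure of dependency).
    Returns: dict mapping each reachable node to a tuple
             (route_strength, route), where the route is listed from
             the effect back to that node.
    """
    best = {effect: (1.0, [effect])}
    heap = [(-1.0, effect)]  # max-heap emulated with negated strengths
    while heap:
        neg_strength, node = heapq.heappop(heap)
        strength = -neg_strength
        if strength < best[node][0]:
            continue  # stale heap entry, a stronger route is already known
        for parent, weight in parents.get(node, {}).items():
            candidate = strength * weight
            if parent not in best or candidate > best[parent][0]:
                best[parent] = (candidate, best[node][1] + [parent])
                heapq.heappush(heap, (-candidate, parent))
    return best


# Hypothetical dependency model: effect -> {possible cause: strength}.
example_parents = {
    "service_down": {"server_fault": 0.9, "link_fault": 0.5},
    "server_fault": {"power_fault": 0.8},
    "link_fault": {"power_fault": 0.2, "router_fault": 0.8},
}

routes = strongest_routes(example_parents, "service_down")

# Root causes are the reachable nodes that have no parents of their own.
root_causes = [n for n in routes if not example_parents.get(n)]
strongest_cause = max(root_causes, key=lambda n: routes[n][0])
print(strongest_cause, routes[strongest_cause][1])
```

Because every strength is at most 1, extending a route can never make it stronger, which is what allows a Dijkstra-style search to be used here. A real Bayesian network would derive these strengths from its conditional probability tables, and that step is outside the scope of this sketch.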








ACKNOWLEDGMENTS
This research is part of the International Quality Networks (IQN) project and is supported by DAAD (the German Academic Exchange Service). The authors would like to thank Carsten Schippang for providing a full year of sample data from the campus network of FernUniversität Hagen. Many thanks are also due to the anonymous reviewers for their valuable comments.
Author information
Jianguo Ding received his M.Sc. in computer science from Hefei University of Technology, P. R. China, in 1999. He obtained a joint Ph.D. in computer science from Shanghai Jiao Tong University in P. R. China and FernUniversität Hagen in Germany in 2005. He was supported by a DAAD (German Academic Exchange Service) scholarship. He is a member of the IEEE. His current research interests include distributed systems management, intelligent technology, and probabilistic reasoning.
Bernd Krämer is a full professor at FernUniversität in Hagen, Germany. He obtained his diploma and doctorate in computer science from the Technical University of Berlin. He is the president of the international Society for Process and Design Sciences. His research interests include distributed software engineering, e-learning technology, distributed systems management, and dependable software.
Yingcai Bai graduated from Tsinghua University, P. R. China. He is a full professor at Shanghai Jiao Tong University, P. R. China. He is also the president of Shanghai Engineering Center of GOLDEN Network and the president of Shanghai Computer Open System Association. His research interests include network architecture, network security, and distributed systems management.
Hansheng Chen graduated in mathematics from Fudan University, P. R. China. He is a professor at East-China Institute of Computer Technology. He is a visiting professor at FernUniversität in Hagen, Germany. His research focuses on software engineering and distributed systems.
Cite this article
Ding, J., Krämer, B., Bai, Y. et al. Backward Inference in Bayesian Networks for Distributed Systems Management. J Netw Syst Manage 13, 409–427 (2005). https://doi.org/10.1007/s10922-005-9003-8
DOI: https://doi.org/10.1007/s10922-005-9003-8