Backward Inference in Bayesian Networks for Distributed Systems Management

Ding, Jianguo; Krämer, Bernd; Bai, Yingcai; Chen, Hansheng

doi:10.1007/s10922-005-9003-8

Published: 10 December 2005

Journal of Network and Systems Management, Volume 13, pages 409–427 (2005)

The growing complexity of distributed systems in terms of hardware components, operating systems, communication and application software, together with the large number of dependencies among them, has increased the demand for distributed management systems. An efficient distributed management system needs to work effectively even in the face of incomplete management information, uncertain situations, and dynamic changes. In this paper, Bayesian networks are proposed to model dependencies between managed objects in distributed systems. The strongest dependency route (SDR) algorithm is developed for backward inference in Bayesian networks. The SDR algorithm can track the strongest causes and trace the strongest routes between particular effects and their causes; the strongest dependency of the causes can also be obtained by the algorithm. Backward inference thus provides an efficient mechanism for fault location and is beneficial for performance management.
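The idea of tracing a strongest route from an observed effect back to its candidate causes can be illustrated with a small sketch. This is not the SDR algorithm as defined in the paper, whose details are not given in the abstract. It assumes, for illustration only, that each dependency edge carries a single strength in (0, 1] and that the strength of a route is the product of its edge strengths. The function name `strongest_routes`, the `parents` mapping, and the example weights are all hypothetical.

```python
import heapq


def strongest_routes(parents, effect):
    """Trace backward from `effect` to every ancestor.

    parents: dict mapping a node to a list of (parent, strength) pairs,
             where strength is an assumed dependency weight in (0, 1].
    Returns {ancestor: (route_strength, route_from_effect_to_ancestor)}.
    """
    best = {effect: 1.0}           # best known route strength per node
    route = {effect: [effect]}     # route achieving that strength
    heap = [(-1.0, effect)]        # max-heap via negated strengths
    while heap:
        neg_strength, node = heapq.heappop(heap)
        strength = -neg_strength
        if strength < best.get(node, 0.0):
            continue               # stale heap entry, a stronger route exists
        for parent, weight in parents.get(node, []):
            candidate = strength * weight
            if candidate > best.get(parent, 0.0):
                best[parent] = candidate
                route[parent] = route[node] + [parent]
                heapq.heappush(heap, (-candidate, parent))
    return {n: (best[n], route[n]) for n in best if n != effect}


# Hypothetical dependency graph: effect E depends on A and B, both depend on C.
example_parents = {
    "E": [("A", 0.9), ("B", 0.5)],
    "A": [("C", 0.5)],
    "B": [("C", 1.0)],
}
routes = strongest_routes(example_parents, "E")
strongest_cause = max(routes, key=lambda n: routes[n][0])
```

Because every strength is at most 1, a route can only get weaker as it grows, so a Dijkstra-style search with a max-heap is sufficient under these assumptions. In the example, the strongest route from E to C goes through B even though the edge from E to A is stronger.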

ACKNOWLEDGMENTS

This research is part of the International Quality Networks (IQN) project and is supported by DAAD (the German Academic Exchange Service). The authors would like to thank Carsten Schippang for providing a full year of sample data from the campus network of FernUniversität Hagen. Many thanks are also due to the anonymous reviewers for their valuable comments.

Author information

Authors and Affiliations

Software Engineering Institute, East China Normal University, Shanghai, 200062, P. R. China
Jianguo Ding
Department of Electrical Engineering and Information Engineering, FernUniversität Hagen, Hagen, Germany
Bernd Krämer
Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, P. R. China
Yingcai Bai
East-China Institute of Computer Technology, Shanghai, P. R. China
Hansheng Chen

Corresponding author

Correspondence to Jianguo Ding.

Additional information

Jianguo Ding received his M.Sc. in computer science from Hefei University of Technology, P. R. China, in 1999. He obtained a joint Ph.D. in computer science from Shanghai Jiao Tong University in P. R. China and FernUniversität Hagen in Germany in 2005. He was supported by a DAAD (German Academic Exchange Service) scholarship. He is a member of the IEEE. His current research interests include distributed systems management, intelligent technology, and probabilistic reasoning.

Bernd Krämer is a full professor at FernUniversität in Hagen, Germany. He obtained his diploma and doctorate in computer science from the Technical University of Berlin. He is the president of the international Society for Process and Design Sciences. His research interests include distributed software engineering, e-learning technology, distributed systems management, and dependable software.

Yingcai Bai graduated from Tsinghua University, P. R. China. He is a full professor at Shanghai Jiao Tong University, P. R. China. He is also the president of Shanghai Engineering Center of GOLDEN Network and the president of Shanghai Computer Open System Association. His research interests include network architecture, network security, and distributed systems management.

Hansheng Chen graduated in mathematics from Fudan University, P. R. China. He is a professor at East-China Institute of Computer Technology. He is a visiting professor at FernUniversität in Hagen, Germany. His research focuses on software engineering and distributed systems.

About this article

Cite this article

Ding, J., Krämer, B., Bai, Y. et al. Backward Inference in Bayesian Networks for Distributed Systems Management. J Netw Syst Manage 13, 409–427 (2005). https://doi.org/10.1007/s10922-005-9003-8

Issue Date: December 2005
