Abstract
Learning belief networks from large domains can be expensive even with single-link lookahead search (SLLS). Since an SLLS cannot learn correctly in a class of problem domains, multi-link lookahead search (MLLS) is needed, which further increases the computational complexity. In our experiments, learning in some difficult domains of more than a dozen variables took days. In this paper, we study how to use parallelism to speed up SLLS for learning in large domains and to tackle the increased complexity of MLLS for learning in difficult domains. We propose a natural decomposition of the learning task for parallel processing. We investigate two strategies for allocating jobs among processors to further improve load balancing and the efficiency of the parallel system. For learning from very large datasets, we present a regrouping of the available processors so that slow data access through the file system can be replaced by fast memory access. Experimental results on a distributed-memory MIMD computer demonstrate the effectiveness of the proposed algorithms.
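The decomposition and job-allocation ideas summarized above can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' implementation: each lookahead step must score many candidate link changes, and these candidates form a natural pool of independent jobs that can be divided among processors either in contiguous blocks or in round-robin fashion (the latter tends to balance load better when job costs vary).

```python
from itertools import combinations

def candidate_links(n_vars):
    """All unordered variable pairs: a natural unit of work when one
    lookahead step must score every possible single-link change."""
    return list(combinations(range(n_vars), 2))

def block_allocate(jobs, n_procs):
    """Contiguous blocks per processor: simple, but uneven per-job costs
    can leave some processors idle while others still work."""
    size = -(-len(jobs) // n_procs)  # ceiling division
    return [jobs[i * size:(i + 1) * size] for i in range(n_procs)]

def cyclic_allocate(jobs, n_procs):
    """Round-robin allocation: interleaves jobs so that expensive ones
    tend to spread across processors, improving load balance."""
    return [jobs[p::n_procs] for p in range(n_procs)]
```

On a distributed-memory machine each processor would score only its own job list and the best candidate would be chosen by a global reduction; the sketch shows only the partitioning step.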
Xiang, Y., Chu, T. Parallel Learning of Belief Networks in Large and Difficult Domains. Data Mining and Knowledge Discovery 3, 315–339 (1999). https://doi.org/10.1023/A:1009888910252