Bayesian learning of Bayesian networks with informative priors

Annals of Mathematics and Artificial Intelligence

Abstract

This paper presents and evaluates an approach to Bayesian model averaging where the models are Bayesian nets (BNs). A comprehensive study of the literature on structural priors for BNs is conducted. A number of prior distributions are defined using stochastic logic programs, and the MCMC Metropolis-Hastings algorithm is used to (approximately) sample from the posterior. We use proposals that are tightly coupled to the priors, which gives rise to cheaply computable acceptance probabilities. Experiments using data generated from known BNs have been conducted to evaluate the method. The experiments used six different BNs and varied the structural prior, the parameter prior, the Metropolis-Hastings proposal and the data size. Each experiment was repeated three times with different random seeds to test the robustness of the MCMC-produced results. Our results show that with effective priors (i) robust results are produced and (ii) informative priors improve results significantly.
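
The remark that proposals coupled to the prior yield cheap acceptance probabilities can be illustrated in the simplest case: an independence sampler whose proposal is the prior itself. The prior and proposal terms then cancel in the Metropolis-Hastings ratio, and only a ratio of marginal likelihoods remains. The sketch below is not the paper's method, which defines its priors with stochastic logic programs. The three-structure model space, the prior weights, the data set and all function names are assumptions made for illustration, and the marginal likelihood is the Cooper-Herskovits score for binary variables with uniform Dirichlet parameter priors.

```python
import math
import random
from collections import Counter

# Hypothetical toy model space: two binary variables, three structures.
# Each structure maps a node to the tuple of its parents.
STRUCTURES = {
    "empty": {"X": (), "Y": ()},
    "X->Y": {"X": (), "Y": ("X",)},
    "Y->X": {"X": ("Y",), "Y": ()},
}
# Assumed informative structural prior (illustrative numbers only).
PRIOR = {"empty": 0.2, "X->Y": 0.6, "Y->X": 0.2}

# Assumed data set of 30 cases in which X and Y mostly agree.
DATA = (
    [{"X": 0, "Y": 0}] * 12 + [{"X": 1, "Y": 1}] * 12
    + [{"X": 0, "Y": 1}] * 3 + [{"X": 1, "Y": 0}] * 3
)


def log_marginal_likelihood(structure, data):
    """Cooper-Herskovits score, binary variables, uniform Dirichlet priors."""
    total = 0.0
    for node, parents in structure.items():
        counts = Counter()
        for row in data:
            config = tuple(row[p] for p in parents)
            counts[(config, row[node])] += 1
        for config in {key[0] for key in counts}:
            n0, n1 = counts[(config, 0)], counts[(config, 1)]
            # log of n0! * n1! / (n0 + n1 + 1)!
            total += (math.lgamma(n0 + 1) + math.lgamma(n1 + 1)
                      - math.lgamma(n0 + n1 + 2))
    return total


def exact_posterior(data):
    """Posterior by enumeration, feasible only for a tiny model space."""
    unnorm = {
        name: PRIOR[name] * math.exp(log_marginal_likelihood(s, data))
        for name, s in STRUCTURES.items()
    }
    z = sum(unnorm.values())
    return {name: value / z for name, value in unnorm.items()}


def sample_posterior(data, n_iter=20000, seed=0):
    """Metropolis-Hastings with the prior used as the proposal."""
    rng = random.Random(seed)
    names = list(PRIOR)
    weights = [PRIOR[name] for name in names]
    loglik = {n: log_marginal_likelihood(STRUCTURES[n], data) for n in names}
    current = rng.choices(names, weights)[0]
    visits = Counter()
    for _ in range(n_iter):
        proposal = rng.choices(names, weights)[0]
        # Prior and proposal terms cancel: only the likelihood ratio is left.
        log_accept = min(0.0, loglik[proposal] - loglik[current])
        if rng.random() < math.exp(log_accept):
            current = proposal
        visits[current] += 1
    return {name: visits[name] / n_iter for name in names}


if __name__ == "__main__":
    print("exact  :", exact_posterior(DATA))
    print("sampled:", sample_posterior(DATA))
```

Because this toy model space is small enough to enumerate, the sampled frequencies can be checked against the exact posterior. In a realistic BN structure space enumeration is infeasible, which is the situation MCMC is meant for.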

Author information

Corresponding author

Correspondence to James Cussens.

About this article

Cite this article

Angelopoulos, N., Cussens, J. Bayesian learning of Bayesian networks with informative priors. Ann Math Artif Intell 54, 53–98 (2008). https://doi.org/10.1007/s10472-009-9133-x

  • DOI: https://doi.org/10.1007/s10472-009-9133-x
