Bayesian learning of Bayesian networks with informative priors

Angelopoulos, Nicos; Cussens, James

doi:10.1007/s10472-009-9133-x


Published: 14 April 2009

Volume 54, pages 53–98 (2008)

Annals of Mathematics and Artificial Intelligence

Nicos Angelopoulos¹ &
James Cussens²


Abstract

This paper presents and evaluates an approach to Bayesian model averaging where the models are Bayesian nets (BNs). A comprehensive study of the literature on structural priors for BNs is conducted. A number of prior distributions are defined using stochastic logic programs, and the Metropolis-Hastings MCMC algorithm is used to (approximately) sample from the posterior. We use proposals that are tightly coupled to the priors, which gives rise to cheaply computable acceptance probabilities. Experiments using data generated from known BNs have been conducted to evaluate the method. The experiments used six different BNs and varied the structural prior, the parameter prior, the Metropolis-Hastings proposal and the data size. Each experiment was repeated three times with different random seeds to test the robustness of the MCMC-produced results. Our results show that with effective priors (i) robust results are produced and (ii) informative priors improve results significantly.
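The remark about cheaply computable acceptance probabilities can be illustrated with the simplest prior-coupled proposal: drawing each candidate structure directly from the prior. In that special case the prior and proposal terms cancel in the Metropolis-Hastings ratio, leaving only a ratio of marginal likelihoods. The sketch below is not the authors' stochastic-logic-program sampler. It assumes a toy model space of two structures over binary variables X and Y ("X -> Y" versus "X and Y independent"), uniform Beta(1, 1) parameter priors, and made-up data; all function names and `TOY_DATA` are hypothetical.

```python
import math
import random

# Made-up toy data: (x, y) pairs in which y tends to copy x.
TOY_DATA = [(0, 0)] * 6 + [(0, 1)] + [(1, 0)] + [(1, 1)] * 6


def log_ml_binary(n0, n1):
    """Log marginal likelihood of binary counts under a Beta(1, 1) prior.

    Closed form: n0! * n1! / (n0 + n1 + 1)!
    """
    return math.lgamma(n0 + 1) + math.lgamma(n1 + 1) - math.lgamma(n0 + n1 + 2)


def count(values):
    """Return (number of zeros, number of ones) in a list of 0/1 values."""
    ones = sum(values)
    return len(values) - ones, ones


def log_marginal_likelihood(has_edge, data):
    """Score 'X -> Y' (has_edge=True) or 'X, Y independent' (False)."""
    score = log_ml_binary(*count([x for x, _ in data]))
    if has_edge:
        # One conditional distribution of Y per value of its parent X.
        for parent in (0, 1):
            score += log_ml_binary(*count([y for x, y in data if x == parent]))
    else:
        score += log_ml_binary(*count([y for _, y in data]))
    return score


def exact_edge_posterior(data, prior_edge):
    """Exact posterior probability of the edge, feasible for two structures."""
    w_edge = prior_edge * math.exp(log_marginal_likelihood(True, data))
    w_none = (1.0 - prior_edge) * math.exp(log_marginal_likelihood(False, data))
    return w_edge / (w_edge + w_none)


def mh_edge_posterior(data, prior_edge, n_iter, seed=0):
    """Estimate the edge posterior by Metropolis-Hastings, proposing from the prior."""
    rng = random.Random(seed)
    scores = {s: log_marginal_likelihood(s, data) for s in (False, True)}
    current = rng.random() < prior_edge
    edge_visits = 0
    for _ in range(n_iter):
        proposal = rng.random() < prior_edge  # candidate drawn from the prior
        # Prior and proposal terms cancel: only the likelihood ratio remains.
        log_ratio = scores[proposal] - scores[current]
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
        edge_visits += current
    return edge_visits / n_iter


if __name__ == "__main__":
    print("exact:", exact_edge_posterior(TOY_DATA, prior_edge=0.2))
    print("MCMC :", mh_edge_posterior(TOY_DATA, prior_edge=0.2, n_iter=20000))
```

With only two structures the exact posterior is available for comparison; in a realistic BN structure space it is not, which is where sampling becomes necessary.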



Author information

Authors and Affiliations

School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JR, UK
Nicos Angelopoulos
Department of Computer Science & York Centre for Complex Systems Analysis, University of York, Heslington, York, YO10 5DD, UK
James Cussens


Corresponding author

Correspondence to James Cussens.


About this article

Cite this article

Angelopoulos, N., Cussens, J. Bayesian learning of Bayesian networks with informative priors. Ann Math Artif Intell 54, 53–98 (2008). https://doi.org/10.1007/s10472-009-9133-x


Received: 19 March 2009
Accepted: 19 March 2009
Published: 14 April 2009
Issue Date: November 2008
DOI: https://doi.org/10.1007/s10472-009-9133-x
