Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood

Gámez, José A.; Mateo, Juan L.; Puerta, José M.

doi:10.1007/s10618-010-0178-6


Published: 11 May 2010

Volume 22, pages 106–148, (2011)

Data Mining and Knowledge Discovery

José A. Gámez¹,
Juan L. Mateo¹ &
José M. Puerta¹


Abstract

Learning Bayesian networks is known to be an NP-hard problem, which is why heuristic search has proven advantageous in many domains. This learning approach is computationally efficient and, even though it does not guarantee an optimal result, many previous studies have shown that it obtains very good solutions. Hill climbing algorithms are particularly popular because of their good trade-off between computational demands and the quality of the models learned. In spite of this efficiency, these algorithms can be improved upon when dealing with high-dimensional datasets, and that is the goal of this paper. Thus, we present an approach to improve hill climbing algorithms based on dynamically restricting the candidate solutions to be evaluated during the search process. This proposal, dynamic restriction, is new because other studies on restricted search available in the literature are based on two stages rather than the single stage presented here. In addition to the aforementioned advantages of hill climbing algorithms, we show that under certain conditions the model they return is a minimal I-map of the joint probability distribution underlying the training data, which is a nice theoretical property with practical implications. In this paper we provide theoretical results that guarantee that, under these same conditions, the proposed algorithms also output a minimal I-map. Furthermore, we experimentally test the proposed algorithms over a set of different domains, some of them quite large (up to 800 variables), in order to study their behavior in practice.
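
To make the idea of restricting candidates during a single search stage concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the restriction rule, so the sketch assumes one plausible rule, namely that a candidate arc whose addition fails to improve the score is excluded, in both directions, for the rest of the search. It also assumes a BIC score on discrete data and uses only arc addition and deletion. All function and variable names (`local_bic`, `restricted_hill_climb`, `forbidden`, `demo_data`) are hypothetical.

```python
import math
from collections import Counter

def local_bic(data, child, parents):
    """BIC contribution of one node given a parent set (discrete data)."""
    n = len(data)
    pa = sorted(parents)
    joint = Counter((tuple(row[p] for p in pa), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in pa) for row in data)
    loglik = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
    r = len({row[child] for row in data})      # states of the child
    q = 1                                      # parent configurations
    for p in pa:
        q *= len({row[p] for row in data})
    return loglik - 0.5 * math.log(n) * (r - 1) * q

def creates_cycle(parents, src, dst):
    """True if adding src -> dst would close a directed cycle."""
    stack, seen = [src], set()
    while stack:                               # walk the ancestors of src
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def restricted_hill_climb(data, n_vars):
    parents = {v: set() for v in range(n_vars)}
    forbidden = {v: set() for v in range(n_vars)}   # assumed restriction rule
    score = {v: local_bic(data, v, parents[v]) for v in range(n_vars)}
    evaluations = 0
    while True:
        best_gain, best_move = 1e-9, None
        for child in range(n_vars):
            for cand in range(n_vars):
                if cand == child or cand in forbidden[child]:
                    continue                   # pruned from the neighborhood
                is_addition = cand not in parents[child]
                if is_addition and creates_cycle(parents, cand, child):
                    continue
                new_pa = parents[child] ^ {cand}   # add or delete the arc
                evaluations += 1
                gain = local_bic(data, child, new_pa) - score[child]
                if is_addition and gain <= 0:
                    # Assumption: a useless addition is never reconsidered.
                    forbidden[child].add(cand)
                    forbidden[cand].add(child)
                elif gain > best_gain:
                    best_gain, best_move = gain, (child, new_pa)
        if best_move is None:
            return parents, evaluations
        child, new_pa = best_move
        parents[child] = set(new_pa)
        score[child] = local_bic(data, child, new_pa)

def demo_data():
    """X1 copies X0 90% of the time; X2 is exactly independent of both."""
    rows = []
    for x0 in (0, 1):
        for x2 in (0, 1):
            rows += [(x0, x0, x2)] * 45 + [(x0, 1 - x0, x2)] * 5
    return rows

parents, evaluations = restricted_hill_climb(demo_data(), 3)
print(parents, evaluations)
```

On the toy data, the pairs involving the independent variable are pruned during the first pass, so later passes score fewer candidates. The restriction happens inside the search loop itself rather than in a separate preliminary stage, which is the contrast with two-stage methods that the abstract draws.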



Author information

Authors and Affiliations

Department of Computing Systems, Intelligent Systems and Data Mining Group-i3A, University of Castilla-La Mancha, 02071, Albacete, Spain
José A. Gámez, Juan L. Mateo & José M. Puerta

Corresponding author

Correspondence to Juan L. Mateo.

Additional information

Responsible editor: Charles Elkan.

About this article

Cite this article

Gámez, J.A., Mateo, J.L. & Puerta, J.M. Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood. Data Min Knowl Disc 22, 106–148 (2011). https://doi.org/10.1007/s10618-010-0178-6

Received: 03 November 2009
Accepted: 28 April 2010
Published: 11 May 2010
Issue Date: January 2011
DOI: https://doi.org/10.1007/s10618-010-0178-6
