Abstract
This paper addresses the problem of efficiently finding a Bayesian network structure that maximizes the posterior probability. In particular, we focus on the branch-and-bound (B&B) strategy to save the computational effort of finding the largest score. To make the search more efficient, we need a tighter upper bound, so that the current best score can exceed it more easily and more branches can be pruned. We derive two upper bounds and prove that they are tighter than the existing one (Campos and Ji, J Mach Learn Res 12(3):663–689, 2011). Finally, we demonstrate on the Alarm and Insurance data sets that the proposed bounds make the search much more efficient. For example, the search is two to three times faster for \(n=100\) and almost twice as fast for \(n=500\). We also verify experimentally that the overhead of replacing the existing pruning rule with the proposed one is negligible.



References
Beinlich, I.A., Suermondt, H.J., Chavez, R.M., Cooper, G.F.: The ALARM monitoring system: a case study with two probabilistic inference techniques for belief networks. In: The 2nd European Conference on Artificial Intelligence in Medicine, pp. 247–256. Springer, London (1989)
Binder, J., Koller, D., Russell, S., Kanazawa, K.: Adaptive probabilistic networks with hidden variables. Mach. Learn. 29(2–3), 213–244 (1997)
Buntine, W.: Theory refinement on Bayesian networks. In: Uncertainty in Artificial Intelligence, pp. 52–60. Morgan Kaufmann, Los Angeles (1991)
Campos, C.P., Ji, Q.: Efficient structure learning of Bayesian networks using constraints. J. Mach. Learn. Res. 12(3), 663–689 (2011)
Chickering, D.M., Meek, C., Heckerman, D.: Large-sample learning of Bayesian networks is NP-hard. In: Uncertainty in Artificial Intelligence, pp. 124–133. Morgan Kaufmann, Acapulco (2003)
Cooper, G.F., Herskovits, E.: A Bayesian method for the induction of probabilistic networks from data. Mach. Learn. 9(4), 309–347 (1992)
Cussens, J., Bartlett, M.: GOBNILP 1.6.2 User/Developer Manual. University of York, York (2015)
Fan, X., Malone, B., Yuan, C.: Finding optimal Bayesian network structures with constraints learned from data. In: Uncertainty in Artificial Intelligence, pp. 200–209. AUAI Press, Corvallis (2014)
Jeffreys, H.: Theory of Probability. Oxford University Press, Oxford (1939)
Krichevsky, R.E., Trofimov, V.K.: The performance of universal encoding. IEEE Trans. Inf. Theory IT-27(2), 199–207 (1981)
Ott, S., Imoto, S., Miyano, S.: Finding optimal models for small gene networks. Pac. Symp. Biocomput. 9, 557–567 (2004)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Representation and Reasoning), 2nd ed. Morgan Kaufmann, Burlington (1988)
Rissanen, J.: Modeling by shortest data description. Automatica 14, 465–471 (1978)
Silander, T., Myllymaki, P.: A simple approach for finding the globally optimal Bayesian network structure. In: Uncertainty in Artificial Intelligence, pp. 445–452. Morgan Kaufmann, Arlington (2006)
Singh, A.P., Moore, A.W.: Finding optimal Bayesian networks by dynamic programming. Technical Report, Carnegie Mellon University (2005)
Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction, and Search. Springer, Berlin (1993)
Suzuki, J.: A construction of Bayesian networks from databases based on an MDL principle. In: Uncertainty in Artificial Intelligence, pp. 266–273. Morgan Kaufmann, Washington DC (1993)
Suzuki, J.: Learning Bayesian belief networks based on the minimum description length principle: an efficient algorithm using the B&B technique. In: International Conference on Machine Learning, pp. 462–470. Morgan Kaufmann, Bari (1996)
Suzuki, J.: Efficiently learning Bayesian network structures based on the B&B strategy: a theoretical analysis. In: Advanced Methodologies for Bayesian Networks, Yokohama, Japan (2015). Also published as Lecture Notes in Artificial Intelligence 9095. Springer, Berlin (2016)
Tian, J.: A branch-and-bound algorithm for MDL learning Bayesian networks. In: Uncertainty in Artificial Intelligence, pp. 580–588. Morgan Kaufmann, Stanford (2000)
Ueno, M.: Learning networks determined by the ratio of prior and data. In: Uncertainty in Artificial Intelligence, pp. 598–605 (2010)
Acknowledgements
The author wishes to express his gratitude to Dr. Jun Kawahara of the Nara Institute of Science and Technology for correcting the program of the proposed algorithm.
Appendix
We assume that z takes on a value in \(\{1,\ldots ,\gamma \}\) with \(\gamma \ge 2\).
Proof of Lemma 1
In general,
for \(j=1,\ldots ,c(z)\) and \(z=1,\ldots ,\gamma \). By multiplying both sides over all \(j=1,\ldots ,c(z)\), we have
for \(z=1,\ldots ,\gamma \). By further multiplying both sides over all \(z=1,\ldots ,\gamma \), we obtain the lemma.
Proof of Lemma 2
Note that, in general, for \(0<p<q\), the function \(f_{p,q}(u):=\frac{u+p}{u+q}\) satisfies \(f_{p,q}(u)<1\) and is monotonically increasing in \(u\ge 0\). Thus,
for \(j=1,\ldots ,c(z)\) and \(z=1,\ldots ,\gamma \). By multiplying both sides over all \(j=1,\ldots ,c(z)\), we have
for \(z=1,\ldots ,\gamma \). By further multiplying both sides over all \(z=1,\ldots ,\gamma \), we obtain the lemma.
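The monotonicity of \(f_{p,q}\) claimed at the start of this proof can be verified directly by differentiation (a one-line check, using only the assumption \(0<p<q\)):

```latex
f_{p,q}'(u)
  = \frac{(u+q)-(u+p)}{(u+q)^2}
  = \frac{q-p}{(u+q)^2} > 0
  \quad\text{for } u\ge 0,
```

and \(f_{p,q}(u)<1\) follows immediately from \(u+p<u+q\).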
Proof of Lemma 3
Similar to the proof of Lemma 2, we have
for \(j=1,\ldots ,c(z)\) and \(z=1,\ldots ,\gamma \). By multiplying both sides over all \(j=1,\ldots ,c(z)\), we have
for \(z=1,\ldots ,\gamma \). By further multiplying both sides over all \(z=1,\ldots ,\gamma \), we obtain the lemma.
Proof of Theorem 5
We prove
which is equivalent to
We regard (27) as a function of \(\alpha \ge 1\). We find that both sides are equal when \(\alpha =1\), that \(\Gamma (1/2)^{\alpha \beta }/\Gamma (\alpha \beta )\) decreases with \(\alpha \ge 1\), and that \(B(x+r_1,\ldots ,x+r_m)\) with constants \(r_1,\ldots ,r_m>0\) decreases with \(x>0\), where \(B(r_1,\ldots ,r_m)\) is the multivariate Beta function defined by \(\frac{\prod _{i=1}^m \Gamma (r_i)}{\Gamma (\sum _{i=1}^m r_i)}\). These three facts imply the theorem.
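The third fact, that \(B(x+r_1,\ldots ,x+r_m)\) decreases in \(x\), can be sketched via the digamma function \(\psi =(\log \Gamma )'\) (a verification sketch, assuming \(m\ge 2\)):

```latex
\frac{d}{dx}\log B(x+r_1,\ldots ,x+r_m)
  = \sum_{i=1}^{m}\psi (x+r_i)
    \;-\; m\,\psi\!\Bigl(mx+\sum_{j=1}^{m} r_j\Bigr) \;<\; 0,
```

since \(\psi \) is strictly increasing on \((0,\infty )\) and \(x+r_i < mx+\sum_{j=1}^{m} r_j\) for every \(i\) when \(m\ge 2\) and all \(r_j>0\); hence each term of the sum is smaller than \(\psi (mx+\sum_j r_j)\).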
About this article
Cite this article
Suzuki, J. An Efficient Bayesian Network Structure Learning Strategy. New Gener. Comput. 35, 105–124 (2017). https://doi.org/10.1007/s00354-016-0007-6