Marginal information for structure learning

Statistics and Computing

Abstract

Structure learning for Bayesian networks is usually carried out heuristically, searching for an optimal model while avoiding an explosive computational burden. In such a learning process, a structural error made at one step may degrade the learning at subsequent steps. We propose a remedy for this error-for-error process that uses marginal model structures: local errors in structure are fixed by reference to the marginal structures. In this sense, we call the remedy a marginally corrective procedure. We devise a new score function for the procedure, consisting of two components: the likelihood function of a model and a discrepancy measure in marginal structures. In experiments with benchmark data sets, the proposed method compares favourably with a couple of the most popular algorithms.
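
To make the idea of a two-component score concrete, the following is a minimal illustrative sketch and not the score function defined in the article. Everything in it is an assumption made for illustration: the function names (`log_likelihood`, `marginal_discrepancy`, `corrective_score`), the weight `lam`, the choice to combine the two components by subtraction, and the discrepancy itself, which here simply counts edge disagreements between the candidate's skeleton restricted to a variable subset and a given reference marginal structure. Restricting the skeleton is a simplification and is not a true graph marginalization.

```python
import math
from collections import Counter


def log_likelihood(data, parents):
    """Maximised multinomial log-likelihood of discrete data under a DAG.

    data:    list of dicts mapping variable name -> observed value
    parents: dict mapping each variable -> tuple of its parent variables
    """
    total = 0.0
    for var, pa in parents.items():
        # Counts of (parent configuration, child value) and of parent configuration.
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        for (pa_val, _), n in joint.items():
            total += n * math.log(n / marg[pa_val])
    return total


def skeleton(parents):
    """Undirected edge set of the DAG, each edge a frozenset of two variables."""
    return {frozenset((p, v)) for v, pa in parents.items() for p in pa}


def marginal_discrepancy(parents, marginals):
    """Hypothetical discrepancy: number of edge disagreements per marginal.

    marginals: list of (variable subset, reference edge set) pairs.
    ASSUMPTION: the candidate's marginal structure is approximated by the
    skeleton edges lying inside the subset (an induced subgraph).
    """
    edges = skeleton(parents)
    total = 0
    for subset, ref_edges in marginals:
        induced = {e for e in edges if e <= subset}
        total += len(induced ^ ref_edges)  # symmetric difference
    return total


def corrective_score(data, parents, marginals, lam=1.0):
    """ASSUMED combination: log-likelihood minus lam times the discrepancy."""
    return log_likelihood(data, parents) - lam * marginal_discrepancy(parents, marginals)
```

Under these assumptions, a candidate that omits an edge present in a reference marginal structure is penalised once for that edge, in addition to any loss in likelihood, so a search guided by such a score is pushed back toward structures that agree with the marginals.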

Acknowledgements

This work was supported by a grant from the National Research Foundation of Korea (Grant No. 2016R1D1A1B03936155).

Author information

Corresponding author

Correspondence to Sung-Ho Kim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: The numeric version of Figs. 9, 10 and 11

See Tables 7, 8, and 9.

Table 9 (Numeric version of Fig. 11) The structural distances in moral graph (SDM) for the results with sample sizes 500, 1000, and 3000

About this article

Cite this article

Kim, GH., Kim, SH. Marginal information for structure learning. Stat Comput 30, 331–349 (2020). https://doi.org/10.1007/s11222-019-09877-x

  • DOI: https://doi.org/10.1007/s11222-019-09877-x
