Abstract
Structure learning for Bayesian networks has been made in a heuristic mode in search of an optimal model to avoid an explosive computational burden. In the learning process, a structural error which occurred at a point of learning may deteriorate its subsequent learning. We proposed a remedial approach to this error-for-error process by using marginal model structures. The remedy is made by fixing local errors in structure in reference to the marginal structures. In this sense, we call the remedy a marginally corrective procedure. We devised a new score function for the procedure which consists of two components, the likelihood function of a model and a discrepancy measure in marginal structures. The proposed method compares favourably with a couple of the most popular algorithms as shown in experiments with benchmark data sets.
Similar content being viewed by others
References
Abramson, B., Brown, J., Edwards, W., Murphy, A., Winkler, R.: Hailfinder: a Bayesian system for forecasting severe weather. Int. J. Forecast. 12(1), 57–71 (1996)
Acid, S., de Campos, L.: Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs. J. Artif. Intell. Res. 18, 445–490 (2003)
Amirkhani, H., Rahmati, M., Lucas, P., Hommersom, A.: Exploiting experts knowledge for structure learning of Bayesian networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2154–2170 (2017)
Beinlich, I., Suermondt, H., Chavez, R., Cooper, G.: The alarm monitoring system: a case study with two probabilistic inference techniques for belief networks. Second European Conference on Artificial Intelligence in Medicine 38, 247–256 (1989)
Binder, J., Koller, D., Russell, S., Kanazawa, K.: Adaptive probabilistic networks with hidden variables. Mach. Learn. 29(2–3), 213–244 (1997)
Birch, M.: Maximum likelihood in three-way contingency tables. J. R. Stat. Soc. 25, 220–223 (1963)
Birch, M.: The detection of partial association I: the \(2\times 2\) case. J. R. Stat. Soc. 26, 313–324 (1964)
Buntine, W.: Theory refinement on Bayesian networks. Proc. Uncertain. Artif. Intell. 7, 52–60 (1991)
Chen, X., Anantha, G., Wang, X.: An effective structure learning method for constructing gene networks. Bioinformatics 22, 1367–1374 (2006)
Chickering, D.: Learning Bayesian networks is NP-complete. In: Learning from Data: Artificial Intelligence and Statistics V, pp. 121–130 (1996)
Chickering, D.: Optimal structure identification with greedy search. J. Mach. Learn. Res. 3, 507–554 (2002)
Chickering, D., Geiger, D., Heckerman, D.: Learning Bayesian networks: search methods and experimental results. In: Proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, pp. 112–128 (1995)
Cochran, W.: The chi-square test of goodness of fit. Ann. Math. Stat. 23, 315–345 (1952)
Cooper, G., Herskovitz, E.: A Bayesian method for the induction of probabilistic networks from data. Mach. Learn. 9, 309–347 (1992)
Darroach, J., Lauritzen, S., Speed, T.: Markov fields and log-linear interaction models for contingency tables. Ann. Stat. 8(3), 522–539 (1980)
Dawid, A., Lauritzen, S.: Hyper Markov laws in the statistical analysis of decomposable graphical models. Ann. Stat. 21(3), 1272–1317 (1993)
Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol. 39(1), 1–38 (1977)
Diez, F., Mira, J., Iturralde, E., Zybillaga, S.: DIAVAL, a Bayesian expert system for echocardiography. Artif Intell. Med. 10(1), 59–73 (1997)
Fienberg, S., Kim, S.: Combining conditional log-linear structures. J. Am. Stat. Assoc. 94(455), 229–239 (1999)
Friedman, M.: The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 32(200), 675–701 (1937)
Friedman, N., Goldszmidt, M., Wyner, A.: Data analysis with Bayesian networks: a bootstrap approach. Proc. Uncertain. Artif. Intell. 15, 196–201 (1999)
Friedman, N., Linial, M., Nachman, I., Pe’er, D.: Using Bayesian networks to analyze expression data. J. Comput. Biol. 7(3–4), 601–620 (2000)
Gámez, J., Mateo, J., Puerta, J.: Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood. Data Min. Knowl. Discov. 22(1–2), 106–148 (2011)
Goh, K., Cusick, M., Valle, D., Childs, B., Vidal, M., Barabasi, A.: The human disease network. Proc. Natl. Acad. Sci. 104, 8685–8690 (2007)
Heckerman, D., Geiger, D., Chickering, D.: Learning Bayesian networks: the combination of knowledge and statistical data. Proc. Uncertain. Artif. Intell. 10, 293–301 (1994)
Jiang, C., Leong, T., Poh, K.: PGMC: a framework for probabilistic graphical model combination. In: AMIA Annual Symposium Proceedings, pp. 370–374 (2005)
Kim, S.: Conditional log-linear structures for log-linear modelling. Comput. Stat. Data Anal. 50(8), 2044–2064 (2006a)
Kim, S.: Properties of Markovian subgraphs of a decomposable graph. In: Gelbukh, A., Reyes-Garcia, C.A. (eds.) MICAI 2006, Lecture Notes in Artificial Intelligence, LNAI 4293 Advances in Artificial Intelligence, pp. 15–26 (2006b)
Kim, S., Lee, S.: Searching model structures based on marginal model structures. In: Lazinica, A. (ed.) New Developments in Robotics, Automation and Control, pp. 355–376 (2008)
Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge (2009)
Koller, D., Sahami, M.: Toward optimal feature selection. In: The 13th International Conference on Machine Learning, pp. 284–292 (1996)
Koster, J.: Marginalizing and conditioning in graphical models. Bernoulli 8, 817–840 (2002)
Kullback, S., Leibler, R.: Information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951)
Larrañaga, P., Poza, M., Yurramendi, Y., Murga, R., Kuijpers, C.: Structure learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters. IEEE Trans. Pattern Anal. Mach. Intell. 18(9), 912–926 (1996)
Lauritzen, S.: Graphical Models. Clarendon Press, Oxford (1996)
Lauritzen, S., Spiegelhalter, D.: Local computation with probabilities on graphical structures and their application to expert systems (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 50(2), 157–224 (1988)
Margaritis, D., Thrun, S.: Bayesian network induction via local neighborhoods. Adv. Neural Inf. Process. Syst. 12, 505–511 (1999)
Massa, M., Lauritzen, S.: Combining statistical models. Contemp. Math. 516, 239–259 (2010)
Pearl, J.: Bayesian networks: a model of self-activated memory for evidential reasoning. In: Proceedings of the 7th Conference of the Cognitive Science Society, vol. 7, pp. 329–334 (1985)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo, CA (1988)
Richardson, M., Domingos, P.: Learning with knowledge from multiple experts. In: Proceedings of the 20th International Conference on Machine Learning, vol. 20, pp. 624–631 (2003)
Richardson, T., Spirtes, P.: Ancestral graph Markov models. Ann. Stat. 30(4), 962–1030 (2002)
Robinson, R.: Counting labeled acyclic digraphs. In: New Directions in the Theory of Graphs, pp. 239–273 (1973)
Scutari, M.: Learning Bayesian networks with the bnlearn R package. J. Stat. Softw. 35(3), 1–22 (2010). https://doi.org/10.18637/jss.v035.i03
Spirtes, P., Glymour, C., Scheines, R.: Causality from probability. In: Tiles, J., McKee, G., Dean, G. (eds.) Evolving Knowledge in the Natural and Behavioral Sciences, pp. 181–199. Pitman, London (1990)
Tillman, R., Danks, D., Glymour, C.: Integrating locally learned causal structures with overlapping variables. In: Advances in Neural Information Processing Systems (NIPS 2008), vol. 21, pp. 1–8 (2008)
Trucco, P., Cagno, E., Ruggeri, F., Grande, O.: A Bayesian belief network modelling of organisational factors in risk analysis: a case study in maritime transportation. Reliab. Eng. Syst. Saf. 93(6), 845–856 (2008)
Tsamardinos, I., Aliferis, C., Statnikov, A.: Algorithms for large scale Markov blanket discovery. In: The 16th International FLAIRS Conference, pp. 376–381 (2003)
Tsamardinos, I., Brown, L., Aliferis, C.: The max-min hill-climbing Bayesian network structure learning algorithm. Mach. Learn. 65(1), 31–78 (2006)
Tsamardinos, I., Triantafillou, S., Lagani, V.: Towards integrative causal analysis of heterogeneous data sets and studies. J. Mach. Learn. Res. 13, 1097–1157 (2012)
Verma, T., Pearl, J.: Causal networks: semantics and expressiveness. Uncertain. Artif. Intell. 5, 69–76 (1990)
Verma, T., Pearl, J.: Equivalence and synthesis of causal models. Uncertain. Artif. Intell. 6, 220–227 (1991)
Whittaker, J.: Graphical Models. Wiley, New York (1990)
Wilcoxon, F.: Individual comparisons by ranking methods. Biom. Bull. 1(6), 80–83 (1945)
Yang, L., Lee, J.: Bayesian belief network-based approach for diagnostics and prognostics of semiconductor manufacturing systems. Robot. Comput. Integr. Manuf. 28(28), 66–74 (2012)
Acknowledgements
This work was supported by a grant from the National Research Foundation of Korea (Grant No. 2016R1D1A1B03936155).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Kim, GH., Kim, SH. Marginal information for structure learning. Stat Comput 30, 331–349 (2020). https://doi.org/10.1007/s11222-019-09877-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11222-019-09877-x