Abstract
Learning Bayesian Networks (BNs) from high-dimensional data is a complex and time-consuming task. Although approaches based on horizontal (instances) or vertical (variables) partitioning exist in the literature, none can guarantee the same theoretical properties as the Greedy Equivalence Search (GES) algorithm, except those based on GES itself. In this paper, we propose a directed ring-based distributed method that uses GES as the local learning algorithm, ensuring the same theoretical properties as GES but requiring less CPU time. The method partitions the set of possible edges and constrains each processor in the ring to work only with its received subset. The global learning process is an iterative algorithm that carries out several rounds until a convergence criterion is met. In each round, each processor receives a BN from its predecessor in the ring, fuses it with its own BN model, and uses the result as the starting solution for a local learning process constrained to its set of edges. It then sends the resulting model to its successor in the ring. Experiments were carried out on three large domains (400–1000 variables), demonstrating our proposal’s effectiveness compared to GES and its fast version (fGES).
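The round structure described above can be sketched in a few lines. This is a toy illustration, not the authors' implementation: `local_learn` and `fuse` are hypothetical placeholders standing in for constrained GES and BN structural fusion, and models are simplified to plain edge sets.

```python
from itertools import combinations

def ring_distributed_learn(variables, num_workers, local_learn, fuse, max_rounds=10):
    """Toy sketch of the ring-based distributed scheme from the abstract.

    Each worker owns a disjoint subset of the candidate edges. In every round
    it fuses the model received from its ring predecessor with its own model,
    then refines the result with a local learner restricted to its subset.
    `local_learn(start, allowed)` and `fuse(a, b)` are placeholders for GES
    and BN fusion; here a model is just a set of edges.
    """
    # Partition the candidate (undirected) edges round-robin among workers.
    all_edges = list(combinations(variables, 2))
    subsets = [set(all_edges[i::num_workers]) for i in range(num_workers)]
    models = [set() for _ in range(num_workers)]

    for _ in range(max_rounds):
        changed = False
        new_models = []
        for i in range(num_workers):
            received = models[(i - 1) % num_workers]      # from ring predecessor
            start = fuse(models[i], received)             # fuse with own model
            refined = local_learn(start, subsets[i])      # constrained local step
            changed = changed or (refined != models[i])
            new_models.append(refined)
        models = new_models
        if not changed:   # convergence: a full round with no change
            break

    # Final consensus model: fusion of all local models.
    result = set()
    for m in models:
        result = fuse(result, m)
    return result
```

With union as a stand-in for fusion and a local learner that simply recovers the true edges inside its allowed subset, edges found by one worker propagate around the ring until every model stabilizes, mimicking how locally learned structure is shared in the real algorithm.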
Notes
1. In this paper, we only consider the case of complete data, i.e., no missing values in the dataset.
Acknowledgements
This work has been funded by the Government of Castilla-La Mancha and “ERDF A way of making Europe” under project SBPLY/21/180225/000062. It is also partially funded by MCIN/AEI/10.13039/501100011033 and “ESF Investing your future” through PID2019–106758GB–C33, TED2021-131291B-I00 and FPU21/01074 projects. Furthermore, this work has been supported by the University of Castilla-La Mancha and “ERDF A Way of Making Europe” under project 2023-GRIN-34437. Finally, this work has also been funded by the predoctoral contract with code 2019-PREDUCLM-10188, granted by the University of Castilla-La Mancha.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Laborda, J.D., Torrijos, P., Puerta, J.M., Gámez, J.A. (2024). A Ring-Based Distributed Algorithm for Learning High-Dimensional Bayesian Networks. In: Bouraoui, Z., Vesic, S. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2023. Lecture Notes in Computer Science(), vol 14294. Springer, Cham. https://doi.org/10.1007/978-3-031-45608-4_10
DOI: https://doi.org/10.1007/978-3-031-45608-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45607-7
Online ISBN: 978-3-031-45608-4