
A Ring-Based Distributed Algorithm for Learning High-Dimensional Bayesian Networks

  • Conference paper
Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2023)

Abstract

Learning Bayesian Networks (BNs) from high-dimensional data is a complex and time-consuming task. Although there are approaches based on horizontal (instances) or vertical (variables) partitioning in the literature, none can guarantee the same theoretical properties as the Greedy Equivalence Search (GES) algorithm, except those based on the GES algorithm itself. In this paper, we propose a directed ring-based distributed method that uses GES as the local learning algorithm, ensuring the same theoretical properties as GES but requiring less CPU time. The method involves partitioning the set of possible edges and constraining each processor in the ring to work only with its received subset. The global learning process is an iterative algorithm that carries out several rounds until a convergence criterion is met. In each round, each processor receives a BN from its predecessor in the ring, fuses it with its own BN model, and uses the result as the starting solution for a local learning process constrained to its set of edges. Subsequently, it sends the model obtained to its successor in the ring. Experiments were carried out on three large domains (400–1000 variables), demonstrating our proposal’s effectiveness compared to GES and its fast version (fGES).
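To make the workflow summarised in the abstract concrete, the sketch below mimics its control flow in Python: partition the candidate edges, then iterate rounds in which each processor fuses the model received from its ring predecessor with its own model, runs a local search restricted to its edge subset, and passes the result on. All names here (partition_edges, local_learn, fuse, ring_learn) are illustrative, and the local learner and fusion steps are deliberately simplified stand-ins for the constrained GES and BN fusion the authors use; this is not the paper's implementation (see https://github.com/JLaborda/cges for that).

```python
# Toy sketch of the ring-based scheme described in the abstract. The local learner
# and the fusion operator are trivial stand-ins so the control flow runs end to end;
# a real system would plug in GES constrained to each edge subset and a proper
# BN structural-fusion operator.

from functools import reduce
from itertools import combinations
import random


def partition_edges(variables, n_workers, seed=0):
    """Randomly split the set of all candidate edges into n_workers disjoint subsets."""
    rng = random.Random(seed)
    edges = list(combinations(variables, 2))
    rng.shuffle(edges)
    return [set(edges[i::n_workers]) for i in range(n_workers)]


def local_learn(start_edges, allowed_edges):
    """Stand-in for the constrained local search (GES limited to 'allowed_edges')."""
    # A real learner would score edge insertions/deletions on data, starting from 'start_edges'.
    return set(start_edges) | set(allowed_edges)


def fuse(bn_a, bn_b):
    """Stand-in for BN structural fusion (here: plain union of edge sets)."""
    return set(bn_a) | set(bn_b)


def ring_learn(variables, n_workers=4, n_rounds=3):
    """Run n_rounds over a directed ring of workers, each refining its own model."""
    subsets = partition_edges(variables, n_workers)
    models = [set() for _ in range(n_workers)]          # each worker's current BN (edge set)
    for _ in range(n_rounds):                           # a convergence test would replace this fixed bound
        for i in range(n_workers):
            received = models[(i - 1) % n_workers]      # model sent by the ring predecessor
            start = fuse(models[i], received)           # fuse it with the worker's own model
            models[i] = local_learn(start, subsets[i])  # local search restricted to its edge subset
    return reduce(fuse, models)                         # one way to report a single global structure


if __name__ == "__main__":
    toy = ring_learn([f"X{i}" for i in range(8)], n_workers=4)
    print(f"toy result has {len(toy)} edges")
```

Representing a network as a bare edge set ignores orientation and acyclicity; the sketch is only meant to show how models circulate around the ring and are repeatedly fused and refined.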


Notes

  1. In this paper, we only consider the case of complete data, i.e., no missing values in the dataset.

  2. https://github.com/cmu-phil/tetrad/releases/tag/v7.1.2-2.

  3. https://github.com/JLaborda/cges.

  4. https://www.openml.org/search?type=data&uploader_id=%3D_33148&tags.tag=bnlearn.


Acknowledgements

This work has been funded by the Government of Castilla-La Mancha and “ERDF A way of making Europe” under project SBPLY/21/180225/000062. It is also partially funded by MCIN/AEI/10.13039/501100011033 and “ESF Investing in your future” through the PID2019-106758GB-C33, TED2021-131291B-I00 and FPU21/01074 projects. Furthermore, this work has been supported by the University of Castilla-La Mancha and “ERDF A Way of Making Europe” under project 2023-GRIN-34437. Finally, this work has also been funded by the predoctoral contract with code 2019-PREDUCLM-10188, granted by the University of Castilla-La Mancha.

Author information

Corresponding author

Correspondence to Jorge D. Laborda.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Laborda, J.D., Torrijos, P., Puerta, J.M., Gámez, J.A. (2024). A Ring-Based Distributed Algorithm for Learning High-Dimensional Bayesian Networks. In: Bouraoui, Z., Vesic, S. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2023. Lecture Notes in Computer Science, vol 14294. Springer, Cham. https://doi.org/10.1007/978-3-031-45608-4_10

  • DOI: https://doi.org/10.1007/978-3-031-45608-4_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-45607-7

  • Online ISBN: 978-3-031-45608-4
