Abstract
Bayesian networks are graphical models that encode complex statistical and causal dependencies, enabling powerful probabilistic inference. Applying these models to real-world problems first requires learning the Bayesian network structure, which represents the dependencies. Classic methods for this problem typically employ score-based search techniques, which are often heuristic, and whose running times and solution quality do not scale well to larger problems. In this paper, we propose RBNets, a novel technique that combines deep reinforcement learning with an exploration strategy guided by the Upper Confidence Bound for learning Bayesian network structures. RBNets formulates structure learning as a highest-value path problem and progressively finds better solutions. Extensive experiments on both real-world and synthetic datasets demonstrate the efficiency and effectiveness of our approach against several state-of-the-art methods.
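The abstract mentions exploration guided by the Upper Confidence Bound (UCB). As a rough, self-contained illustration of that idea (not the authors' RBNets implementation), the classic UCB1 rule scores each action by its empirical mean reward plus an exploration bonus that shrinks as the action is tried more often. All names below are illustrative.

```python
import math
import random

def ucb1_select(counts, values, total):
    """Pick the action maximizing mean reward + sqrt(2*ln(total)/n) (UCB1).

    counts: number of times each action was tried
    values: running mean reward of each action
    total:  total number of trials so far
    Untried actions get an infinite bonus, so each is tried at least once.
    """
    best, best_score = None, float("-inf")
    for a, (n, v) in enumerate(zip(counts, values)):
        score = float("inf") if n == 0 else v + math.sqrt(2 * math.log(total) / n)
        if score > best_score:
            best, best_score = a, score
    return best

# Toy usage: a two-armed Bernoulli bandit where arm 1 has the higher mean.
random.seed(0)
means = [0.3, 0.7]
counts, values = [0, 0], [0.0, 0.0]
for t in range(1, 2001):
    a = ucb1_select(counts, values, t)
    reward = 1.0 if random.random() < means[a] else 0.0
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]  # incremental mean update
```

After enough trials, UCB1 concentrates its pulls on the better arm while still occasionally revisiting the worse one, which is the exploration/exploitation balance the abstract alludes to.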
This work was supported by the National Key R&D Program of China [2020YFB1707900], the National Natural Science Foundation of China [62272302, 62202055, 62172276], Shanghai Municipal Science and Technology Major Project [2021SHZDZX0102], and CCF-Ant Research Fund [CCF-AFSG RF20220218].
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zheng, Z., Wang, C., Gao, X., Chen, G. (2023). RBNets: A Reinforcement Learning Approach for Learning Bayesian Network Structure. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science(), vol 14171. Springer, Cham. https://doi.org/10.1007/978-3-031-43418-1_12
Print ISBN: 978-3-031-43417-4
Online ISBN: 978-3-031-43418-1