Abstract
Structure learning aims to recover the qualitative part of a Bayesian network, that is, the graph of dependencies among its variables. Because Bayesian networks are interpretable and support logical reasoning, they are applied in many fields. As high-dimensional, low-sample-size data become common in some applications, learning Bayesian network structures from such data has become a challenging problem. To address it, we propose a method for learning high-dimensional Bayesian network structures based on multi-granularity information. First, an undirected independence graph construction method that incorporates global structure information is designed to optimize the search space of network structures. Then, an improved agglomerative hierarchical clustering method is presented to cluster variables into sub-granules, which reduces the complexity of structure learning by exploiting the community characteristics of variables in high-dimensional data. Finally, the corresponding sub-graphs are formed by learning the internal structure of each sub-granule, and the final network structure is constructed with the proposed construct link graph algorithm. To verify the proposed method, we conduct two types of experiments: a comparison experiment and an embedded comparison experiment. The results show that our approach outperforms the competing methods: it not only learns Bayesian network structures from high-dimensional data efficiently but also improves the efficiency and accuracy of the network structures generated by other algorithms on high-dimensional data.
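The first two steps described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes discrete data, uses plain empirical mutual information as the dependence measure, builds the undirected graph with a fixed threshold, and uses ordinary average-linkage agglomerative clustering rather than the improved variant proposed in the paper. All function names, the threshold value, and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def pairwise_mi(columns):
    """MI for every unordered pair of variables; columns maps name -> values."""
    return {
        frozenset((a, b)): mutual_information(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


def independence_graph(mi, threshold):
    """Keep an undirected edge wherever MI exceeds the (assumed) threshold."""
    return {pair for pair, value in mi.items() if value > threshold}


def cluster_variables(names, mi, n_clusters):
    """Average-linkage agglomerative clustering with MI as the similarity."""
    clusters = [{name} for name in names]

    def linkage(c1, c2):
        total = sum(mi[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the highest average pairwise MI.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Toy data (hypothetical): A and B are copies, C and D are copies,
# and the two pairs are statistically independent of each other.
data = {
    "A": [0, 0, 1, 1, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
    "D": [0, 1, 0, 1, 0, 1, 0, 1],
}
mi = pairwise_mi(data)
graph = independence_graph(mi, threshold=0.1)
granules = cluster_variables(list(data), mi, n_clusters=2)
```

On this toy input the graph keeps only the A-B and C-D edges, and the clustering groups the variables into the sub-granules {A, B} and {C, D}. In a full pipeline, a structure learner would then be run inside each sub-granule and the resulting sub-graphs linked together; that stage is omitted here because the abstract does not specify it.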
Funding
This work was supported in part by the National Natural Science Foundation of China (61876027) and the Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002, cstc2019jscx-mbdxX0048).
Ethics declarations
Conflicts of Interest
The authors declare that they have no conflict of interest.
Research Involving Human Participants and/or Animals
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
He, C., Yu, H., Gu, S. et al. A Multi-Granularity Information-Based Method for Learning High-Dimensional Bayesian Network Structures. Cogn Comput 14, 1805–1817 (2022). https://doi.org/10.1007/s12559-021-09891-0
DOI: https://doi.org/10.1007/s12559-021-09891-0