A fast calculation of metric scores for learning Bayesian network

Lv, Qiang; Xia, Xiao-Yan; Qian, Pei-De

doi:10.1007/s11633-012-0614-8

Published: 22 February 2012

Volume 9, pages 37–44, (2012)
International Journal of Automation and Computing

Qiang Lv^1,2,
Xiao-Yan Xia^1,2 &
Pei-De Qian^1,2


Abstract

Frequent counting is an operation required very often in machine learning algorithms. A typical machine learning task, learning the structure of a Bayesian network (BN) based on metric scoring, is introduced as an example that relies heavily on frequent counting. A fast calculation method for frequent counting, enhanced with two cache layers, is then presented for learning BNs. The main contribution of our approach is to eliminate comparison operations in frequent counting by introducing a calculation based on a multi-radix number system. Both a mathematical analysis and an empirical comparison between our method and the state-of-the-art solution are conducted. The results show that our method clearly outperforms the state-of-the-art solution on the problem of learning BNs.
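To make the multi-radix idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes every variable is discrete, coded as an integer in 0..r-1, with a known cardinality r. Each joint configuration of the selected variables is then read as a number in a mixed-radix system and used directly as an array index, so counting needs no comparison against stored configurations. The names `mixed_radix_index` and `count_configurations` are hypothetical, and the two cache layers mentioned above are not modelled.

```python
from typing import List, Sequence


def mixed_radix_index(values: Sequence[int], radices: Sequence[int]) -> int:
    """Encode one configuration as a single integer (Horner's scheme).

    Assumes 0 <= values[i] < radices[i] for every position i.
    """
    index = 0
    for value, radix in zip(values, radices):
        index = index * radix + value
    return index


def count_configurations(
    rows: Sequence[Sequence[int]],
    columns: Sequence[int],
    cardinalities: Sequence[int],
) -> List[int]:
    """Count how often each joint configuration of `columns` occurs in `rows`.

    The result is a flat list whose position is the mixed-radix index of a
    configuration, so each record costs one index computation and one
    increment.
    """
    radices = [cardinalities[c] for c in columns]
    table_size = 1
    for radix in radices:
        table_size *= radix
    counts = [0] * table_size
    for row in rows:
        counts[mixed_radix_index([row[c] for c in columns], radices)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical data: three variables with 2, 2 and 3 states.
    data = [(0, 1, 2), (1, 1, 0), (0, 1, 2), (1, 0, 1)]
    print(count_configurations(data, columns=(0, 2), cardinalities=(2, 2, 3)))
```

The table has one cell per possible configuration, so this direct-indexing sketch only suits variable subsets whose product of cardinalities fits in memory. A hash map keyed by the same integer index would be the natural fallback for larger subsets.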



Author information

Authors and Affiliations

School of Computer Science and Technology, Soochow University, Suzhou, 215006, PRC
Qiang Lv, Xiao-Yan Xia & Pei-De Qian
Jiangsu Provincial Key Lab for Computer Information Processing Technology, Suzhou, 215006, PRC
Qiang Lv, Xiao-Yan Xia & Pei-De Qian

Corresponding author

Correspondence to Qiang Lv.

Additional information

This work was supported by National Natural Science Foundation of China (No. 60970055).

Qiang Lv graduated from Soochow University, PRC in 1988. He received the M.S. degree from China Eastern Institute of Technology in 1991 and the Ph.D. degree from Soochow University in 2006. He is currently a professor at the School of Computer Science and Technology, Soochow University.

His research interests include bioinformatics, metaheuristic search, and parallel and distributed computing.

Xiao-Yan Xia received the B.Sc. degree in computer science from Soochow University, PRC in 2003. She is currently a research fellow at the Provincial Key Laboratory for Computer Information Processing Technology, Soochow University.

Her research interests include database system design and its applications.

Pei-De Qian received the B.Sc. degree in computer science from Nanjing University, PRC in 1982. He is currently a professor at the School of Computer Science and Technology, Soochow University.

His research interests include Chinese information processing, distributed computing, and operating systems.

About this article

Cite this article

Lv, Q., Xia, XY. & Qian, PD. A fast calculation of metric scores for learning Bayesian network. Int. J. Autom. Comput. 9, 37–44 (2012). https://doi.org/10.1007/s11633-012-0614-8

Received: 24 January 2010
Revised: 21 April 2011
Published: 22 February 2012
Issue Date: February 2012
DOI: https://doi.org/10.1007/s11633-012-0614-8
