Article

Tractable learning of large Bayes net structures from sparse data

Authors:

Anna Goldenberg,

Andrew Moore

ICML '04: Proceedings of the twenty-first international conference on Machine learning

Page 44

https://doi.org/10.1145/1015330.1015406

Published: 04 July 2004

Abstract

This paper addresses three questions. Is it useful to attempt to learn a Bayesian network structure with hundreds of thousands of nodes? How should such structure search proceed practically? The third question arises out of our approach to the second: how can Frequent Sets (Agrawal et al., 1993), which are extremely popular in the area of descriptive data mining, be turned into a probabilistic model?

Large sparse datasets with hundreds of thousands of records and attributes appear in social networks, warehousing, supermarket transactions and web logs. The complexity of structural search has made learning factored probabilistic models on such datasets infeasible. We propose to use Frequent Sets to significantly speed up the structural search. Unlike previous approaches, we not only cache n-way sufficient statistics, but also exploit their local structure. We also present an empirical evaluation of our algorithm applied to several massive datasets.
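To make the terminology in the abstract concrete, the following is a minimal sketch of level-wise (Apriori-style) frequent set mining in the spirit of Agrawal et al. (1993), with the resulting counts kept as a cache of sufficient statistics. It is not the authors' algorithm, which is not described on this page. The function names `frequent_sets` and `conditional`, the `max_size` cap, and the toy transactions are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(transactions, min_support, max_size=3):
    """Return {itemset: count} for itemsets seen in >= min_support transactions.

    Illustrative sketch only: a plain level-wise search, not the paper's method.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: n for s, n in counts.items() if n >= min_support}
    cache = dict(current)

    size = 1
    while current and size < max_size:
        size += 1
        # Candidates: unions of two frequent sets that have exactly `size` items.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == size}
        # Prune: every (size - 1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, size - 1))}
        # Count the surviving candidates in one pass over the data.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: n for s, n in counts.items() if n >= min_support}
        cache.update(current)
    return cache


def conditional(cache, target, given):
    """Estimate P(target present | all items in `given` present) from cached counts.

    Returns None when either count is missing from the cache (set not frequent).
    """
    given = frozenset(given)
    joint = given | {target}
    if given not in cache or joint not in cache:
        return None
    return cache[joint] / cache[given]


# Hypothetical toy data: each transaction is the set of attributes that are "on".
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
cache = frequent_sets(toy, min_support=3)
p_b_given_a = conditional(cache, "b", {"a"})
```

In this sketch, sparsity helps because only itemsets that actually co-occur often are ever counted, so the cache stays small even when the number of attributes is large. The conditional estimate is a simple ratio of two cached counts; a structure search could consult such a cache instead of rescanning the data, under the assumption that the sets it needs are frequent.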

References

[1] Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. ACM SIGMOD 12 (pp. 207--216).

[2] Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. VLDB 20 (pp. 487--499).

[3] Bishop, Y., Fienberg, S., & Holland, P. (1977). Discrete multivariate analysis: Theory and practice. MIT Press.

[4] Breese, J., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. UAI 14.

[5] Breiger, R. (2003). Emergent themes in social network analysis: Results, challenges, opportunities. Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers.

[6] Callaway, D., Hopcroft, J., Kleinberg, J., Newman, M., & Strogatz, S. (2001). Are randomly grown graphs really random? Physical Review.

[7] Chickering, D., & Heckerman, D. (1999). Fast learning from sparse data. UAI 15.

[8] Cooper, G., & Herskovits, E. (1991). A Bayesian method for constructing Bayesian belief networks from databases. UAI 7 (pp. 86--94).

[9] Friedman, N. (2004). Inferring cellular networks using probabilistic graphical models. Science.

[10] Friedman, N., Nachman, I., & Pe'er, D. (1999). Learning Bayes network structure from massive datasets: The "sparse candidate" algorithm. UAI 15 (pp. 206--215).

[11] Goldenberg, A., & Moore, A. (2004). Tractable structural learning of large Bayesian networks from sparse data (Technical Report CMU-CALD-04-103). CALD, CMU.

[12] Han, J., & Kamber, M. (2000). Data mining: Concepts and techniques. Morgan Kaufmann Publishers.

[13] Heckerman, D., Geiger, D., & Chickering, D. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20, 197--243.

[14] Hollmen, J., Seppanen, J., & Mannila, H. (2003). Mixture models and frequent sets: combining global and local methods for 0-1 data. SIAM ICDM.

[15] Hulten, G., & Domingos, P. (2002). Mining complex models from arbitrarily large databases in constant time. ACM SIGKDD 8.

[16] Mannila, H., & Toivonen, H. (1996). Multiple uses of frequent sets and condensed representations. KDD 2 (pp. 189--194).

[17] Meila, M. (1999). An accelerated Chow and Liu algorithm: fitting tree distributions to high dimensional sparse data (Technical Report AIM-1652). MIT.

[18] Moore, A., & Lee, M. S. (1998). Cached sufficient statistics for efficient machine learning with large datasets. Journal of Artificial Intelligence Research, 8, 67--91.

[19] Moreno, J., & Jennings, H. (1938). Statistics of social configuration. Sociometry, 342--374.

[20] Oates, T., & Jensen, D. (1998). Large datasets lead to overly complex models: An explanation and a solution. KDD 4.

[21] Pavlov, D., Mannila, H., & Smyth, P. (2003). Beyond independence: probabilistic models for query approximation on binary transaction data. IEEE Transactions on Knowledge and Data Engineering.

[22] Pelleg, D., & Moore, A. (2002). Using Tarjan's red rule for fast dependency tree construction. NIPS 15.

Index Terms

1. Computing methodologies
2. Information systems
  1. Information systems applications

Published In

ICML '04: Proceedings of the twenty-first international conference on Machine learning

July 2004

934 pages

ISBN:1581138385

DOI:10.1145/1015330

Conference Chair:
Carla Brodley
Purdue University/Tufts University

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Article Metrics

Total Citations: 30
Total Downloads: 787
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 1

Reflects downloads up to 03 Mar 2025

Cited By

Bobu A, Peng A, Agrawal P, Shah J, Dragan A, Grollman D, Broadbent E, Ju W, Soh H, Williams T (2024). Aligning Human and Robot Representations. Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, pp. 42-54. doi:10.1145/3610977.3634987. Online publication date: 11-Mar-2024.
https://dl.acm.org/doi/10.1145/3610977.3634987

Scutari M, Vitolo C, Tucker A (2019). Learning Bayesian networks from big data with greedy search: computational complexity and efficient implementation. Statistics and Computing. doi:10.1007/s11222-019-09857-1. Online publication date: 15-Feb-2019.
https://doi.org/10.1007/s11222-019-09857-1

Du S, Song G, Han L, Hong H (2018). Temporal causal inference with time lag. Neural Computation, 30:1, pp. 271-291. doi:10.1162/neco_a_01028. Online publication date: 1-Jan-2018.
https://dl.acm.org/doi/10.1162/neco_a_01028

Zhang J, Zhang C, Yu H (2018). Research on e-commerce intelligent service based on Data Mining. MATEC Web of Conferences, 173 (03012). doi:10.1051/matecconf/201817303012. Online publication date: 19-Jun-2018.
https://doi.org/10.1051/matecconf/201817303012

Rao B, Mitra A (2016). Graph Mining and Its Applications in Studying Community-Based Graph under the Preview of Social Network. Big Data, pp. 970-1022. doi:10.4018/978-1-4666-9840-6.ch045. Online publication date: 2016.
https://doi.org/10.4018/978-1-4666-9840-6.ch045

Rao B, Mitra A (2016). Graph Mining and Its Applications in Studying Community-Based Graph under the Preview of Social Network. Product Innovation through Knowledge Management and Social Media Strategies, pp. 94-146. doi:10.4018/978-1-4666-9607-5.ch005. Online publication date: 2016.
https://doi.org/10.4018/978-1-4666-9607-5.ch005

Xiabing Zhou, Wenhao Huang, Ni Zhang, Weisong Hu, Sizhen Du, Song G, Xie K (2015). Probabilistic dynamic causal model for temporal data. 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1-8. doi:10.1109/IJCNN.2015.7280468. Online publication date: Jul-2015.
https://doi.org/10.1109/IJCNN.2015.7280468

Read J, Martino L, Olmos P, Luengo D (2015). Scalable multi-output label prediction. Pattern Recognition, 48:6, pp. 2096-2109. doi:10.1016/j.patcog.2015.01.004. Online publication date: 1-Jun-2015.
https://dl.acm.org/doi/10.1016/j.patcog.2015.01.004

Farasat A, Nikolaev A, Srihari S, Blair R (2015). Probabilistic graphical models in modern social network analysis. Social Network Analysis and Mining, 5:1. doi:10.1007/s13278-015-0289-6. Online publication date: 19-Oct-2015.
https://doi.org/10.1007/s13278-015-0289-6

Ding F, Zhuang Y (2015). Computing contingency tables from sparse ADtrees. Applied Intelligence, 42:4, pp. 777-789. doi:10.1007/s10489-014-0624-z. Online publication date: 1-Jun-2015.
https://dl.acm.org/doi/10.1007/s10489-014-0624-z