Abstract
To explore the Perturb and Combine idea for estimating probability densities, we study mixtures of tree-structured Markov networks derived either by bagging combined with the Chow-Liu maximum weight spanning tree algorithm, or by pure random sampling of tree structures. We empirically assess the accuracy of these methods against mixture models obtained by EM-based learning of Naive Bayes models and by EM-based learning of mixtures of trees. We find that the bagged ensembles outperform all other methods, while the randomly sampled ones also perform very well. Since the computational complexity of the former is quadratic and that of the latter is linear in the number of variables of interest, this paves the way towards the design of efficient density estimation methods that may be applied to problems with very large numbers of variables and comparatively very small sample sizes.
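As a rough illustration of the bagged variant described above, the sketch below learns one Chow-Liu maximum weight spanning tree per bootstrap replicate of the data. It is a minimal structure-learning sketch only (it omits the per-tree parameter fitting and mixture averaging); the function names and the use of Prim's algorithm over the mutual-information graph are our own choices, not taken from the paper.

```python
import numpy as np
from itertools import combinations

def mutual_information(x, y):
    # Empirical mutual information between two discrete samples.
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            px, py = np.mean(x == a), np.mean(y == b)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

def chow_liu_tree(data):
    """Edge list of the Chow-Liu maximum weight spanning tree (Prim's algorithm)."""
    n_vars = data.shape[1]
    mi = np.zeros((n_vars, n_vars))
    for i, j in combinations(range(n_vars), 2):
        mi[i, j] = mi[j, i] = mutual_information(data[:, i], data[:, j])
    in_tree, edges = {0}, []
    while len(in_tree) < n_vars:
        # Greedily attach the highest-MI edge crossing the cut.
        best = max(((i, j) for i in in_tree
                    for j in range(n_vars) if j not in in_tree),
                   key=lambda e: mi[e])
        edges.append(best)
        in_tree.add(best[1])
    return edges

def bagged_chow_liu(data, n_trees=10, rng=None):
    """Bagging: one Chow-Liu tree per bootstrap replicate of the dataset."""
    rng = rng or np.random.default_rng(0)
    n = data.shape[0]
    return [chow_liu_tree(data[rng.integers(0, n, n)]) for _ in range(n_trees)]
```

The resulting ensemble of tree structures would then be turned into a density estimate by fitting each tree's pairwise factors on its replicate and averaging the tree densities uniformly, which is the mixture step the abstract refers to.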
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Ammar, S., Leray, P., Defourny, B., Wehenkel, L. (2009). Probability Density Estimation by Perturbing and Combining Tree Structured Markov Networks. In: Sossai, C., Chemello, G. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2009. Lecture Notes in Computer Science(), vol 5590. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02906-6_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02905-9
Online ISBN: 978-3-642-02906-6