Probability Density Estimation by Perturbing and Combining Tree Structured Markov Networks

  • Conference paper
Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5590)

Abstract

To explore the Perturb and Combine idea for estimating probability densities, we study mixtures of tree structured Markov networks derived by bagging combined with the Chow and Liu maximum weight spanning tree algorithm, or by pure random sampling. We empirically assess the accuracy of these methods against mixture models derived by EM-based learning of Naive Bayes models and by EM-based learning of mixtures of trees. We find that the bagged ensembles outperform all other methods, while the random ones also perform very well. Since the computational complexity of the former is quadratic and that of the latter is linear in the number of variables of interest, this paves the way towards the design of efficient density estimation methods that may be applied to problems with very large numbers of variables and comparatively very small sample sizes.
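
The full text of the paper is not included here, but the approach the abstract outlines can be illustrated with a minimal Python sketch. The block below is an assumption of how such a method could be coded, not the authors' implementation: all function names are hypothetical, variables are assumed discrete (binary by default), and only structure learning is shown, with per-tree parameter estimation and the resulting mixture density omitted.

```python
import numpy as np


def mutual_information(x, y, n_states):
    """Empirical mutual information between two discrete data columns."""
    joint = np.zeros((n_states, n_states))
    for a, b in zip(x, y):
        joint[a, b] += 1.0
    joint /= joint.sum()
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / np.outer(px, py)[nz])).sum())


def learn_chow_liu_tree(data, n_states=2):
    """Chow & Liu structure: a maximum-weight spanning tree (Prim's algorithm)
    over the pairwise mutual informations, hence a quadratic number of
    pairwise statistics in the number of variables."""
    n_vars = data.shape[1]
    mi = np.zeros((n_vars, n_vars))
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            mi[i, j] = mi[j, i] = mutual_information(data[:, i], data[:, j], n_states)
    in_tree, edges = {0}, []
    while len(in_tree) < n_vars:
        best = max(((i, j) for i in in_tree for j in range(n_vars) if j not in in_tree),
                   key=lambda e: mi[e])
        edges.append(best)
        in_tree.add(best[1])
    return edges


def random_spanning_tree(n_vars, rng):
    """A random tree structure built in linear time by attaching each node of a
    random permutation to a randomly chosen earlier node (one possible sampling
    scheme, not necessarily the one used in the paper)."""
    order = rng.permutation(n_vars)
    return [(int(order[rng.integers(0, k)]), int(order[k])) for k in range(1, n_vars)]


def bagged_chow_liu_mixture(data, n_trees=10, seed=0):
    """One Chow-Liu tree per bootstrap replicate; the ensemble density would be
    the uniform mixture (weight 1/n_trees) of the per-tree densities.  Per-tree
    parameter (CPT) estimation is omitted for brevity."""
    rng = np.random.default_rng(seed)
    structures = []
    for _ in range(n_trees):
        replicate = data[rng.integers(0, len(data), size=len(data))]
        structures.append(learn_chow_liu_tree(replicate))
    return structures
```

The contrast drawn in the abstract is visible in the sketch: learn_chow_liu_tree evaluates a quadratic number of pairwise mutual informations per bootstrap replicate, whereas random_spanning_tree does only linear work in the number of variables, at the cost of ignoring the data when choosing the structure.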

References

  1. Cowell, R., Dawid, A., Lauritzen, S., Spiegelhalter, D.: Probabilistic Networks and Expert Systems. Springer, Heidelberg (1999)

  2. Auvray, V., Wehenkel, L.: Learning inclusion-optimal chordal graphs. In: Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI 2008), pp. 18–25. Morgan Kaufmann, San Francisco (2008)

  3. Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Machine Learning 63(1), 3–42 (2006)

  4. Ridgeway, G.: Looking for lumps: boosting and bagging for density estimation. Computational Statistics & Data Analysis 38(4), 379–392 (2002)

  5. Ammar, S., Leray, P., Defourny, B., Wehenkel, L.: High-dimensional probability density estimation with randomized ensembles of tree structured Bayesian networks. In: Proceedings of the Fourth European Workshop on Probabilistic Graphical Models (PGM 2008), pp. 9–16 (2008)

  6. Chow, C., Liu, C.: Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory 14(3), 462–467 (1968)

  7. Chickering, D., Heckerman, D.: Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables. Machine Learning 29(2-3), 181–212 (1997)

  8. Robinson, R.: Counting unlabeled acyclic digraphs. In: Little, C.H.C. (ed.) Combinatorial Mathematics V. Lecture Notes in Mathematics, vol. 622, pp. 28–43. Springer, Berlin (1977)

  9. Madigan, D., Raftery, A.: Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association 89, 1535–1546 (1994)

  10. Madigan, D., York, J.: Bayesian graphical models for discrete data. International Statistical Review 63, 215–232 (1995)

  11. Friedman, N., Koller, D.: Being Bayesian about network structure. In: Boutilier, C., Goldszmidt, M. (eds.) Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI 2000), pp. 201–210. Morgan Kaufmann Publishers, San Francisco (2000)

  12. Pearl, J.: Fusion, propagation, and structuring in belief networks. Artificial Intelligence 29, 241–288 (1986)

  13. Meila-Predoviciu, M.: Learning with Mixtures of Trees. PhD thesis, MIT (1999)

  14. Efron, B., Tibshirani, R.J.: An introduction to the Bootstrap. Monographs on Statistics and Applied Probability, vol. 57. Chapman and Hall, Boca Raton (1993)

  15. Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)

  16. Kullback, S., Leibler, R.: On information and sufficiency. Annals of Mathematical Statistics 22(1), 79–86 (1951)

  17. Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological) 39(1), 1–38 (1977)

  18. Lowd, D., Domingos, P.: Naive Bayes models for probability estimation. In: Proceedings of the 22nd International Conference on Machine Learning (ICML 2005), pp. 529–536. ACM Press, New York (2005)

  19. Rubinstein, R., Kroese, D.: The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning. Information Science and Statistics. Springer, Heidelberg (2004)

  20. Rosset, S., Segal, E.: Boosting density estimation. In: Proceedings of the 16th International Conference on Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, pp. 267–281 (2002)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ammar, S., Leray, P., Defourny, B., Wehenkel, L. (2009). Probability Density Estimation by Perturbing and Combining Tree Structured Markov Networks. In: Sossai, C., Chemello, G. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2009. Lecture Notes in Computer Science (LNAI), vol 5590. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02906-6_15

  • DOI: https://doi.org/10.1007/978-3-642-02906-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02905-9

  • Online ISBN: 978-3-642-02906-6

  • eBook Packages: Computer Science
