
Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3120)

Abstract

The hierarchical mixture of experts architecture provides a flexible procedure for implementing classification algorithms. The classification is obtained by a recursive soft partition of the feature space in a data-driven fashion. Such a procedure enables local classification, where several experts are used, each assigned the task of classification over some subspace of the feature space. In this work, we provide data-dependent generalization error bounds for this class of models, which lead to effective procedures for performing model selection. Tight bounds are particularly important here, because the model is highly parameterized. The theoretical results are complemented with numerical experiments based on a randomized algorithm, which mitigates the effects of the local minima that plague other approaches such as the expectation-maximization algorithm.
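To make the recursive soft partition concrete, here is a minimal NumPy sketch of a binary hierarchical mixture of experts: internal nodes carry logistic gates that softly route each input toward a subtree, and leaves carry linear experts whose class-probability outputs are mixed according to the gate probabilities. The class layout, parameter shapes, and random initialization are illustrative assumptions, not the authors' exact model or training procedure.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class HMENode:
    """A node in a binary hierarchical mixture of experts.

    Internal nodes hold a logistic gate that softly routes each input
    to the left or right subtree; leaves hold a linear 'expert' whose
    softmax output gives class probabilities over its soft region.
    """

    def __init__(self, d, n_classes, depth, rng):
        self.is_leaf = depth == 0
        if self.is_leaf:
            # Expert: linear map from features to class logits (illustrative init).
            self.W = rng.normal(scale=0.1, size=(d, n_classes))
        else:
            # Gate: linear map from features to a single routing logit.
            self.v = rng.normal(scale=0.1, size=d)
            self.left = HMENode(d, n_classes, depth - 1, rng)
            self.right = HMENode(d, n_classes, depth - 1, rng)

    def predict_proba(self, X):
        """Class probabilities as a gate-weighted mixture of expert outputs."""
        if self.is_leaf:
            return softmax(X @ self.W)
        g = 1.0 / (1.0 + np.exp(-(X @ self.v)))  # P(route left | x)
        return g[:, None] * self.left.predict_proba(X) \
            + (1 - g)[:, None] * self.right.predict_proba(X)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                       # five 3-dimensional inputs
tree = HMENode(d=3, n_classes=2, depth=2, rng=rng)
print(tree.predict_proba(X))                      # each row sums to 1
```

Fitting the gate and expert parameters of such a tree by maximum likelihood (for instance via EM) is susceptible to local minima, which is the motivation the paper gives for its randomized alternative.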




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Azran, A., Meir, R. (2004). Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science (LNAI), vol 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_30


  • DOI: https://doi.org/10.1007/978-3-540-27819-1_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22282-8

  • Online ISBN: 978-3-540-27819-1

