Abstract
The hierarchical mixture of experts architecture provides a flexible procedure for implementing classification algorithms. The classification is obtained by a recursive soft partition of the feature space in a data-driven fashion. Such a procedure enables local classification, where several experts are used, each assigned the task of classification over some subspace of the feature space. In this work, we provide data-dependent generalization error bounds for this class of models, which lead to effective procedures for performing model selection. Tight bounds are particularly important here, because the model is highly parameterized. The theoretical results are complemented with numerical experiments based on a randomized algorithm, which mitigates the effects of the local minima that plague other approaches such as the expectation-maximization algorithm.
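For intuition, bounds of this data-dependent type are built on empirical Rademacher complexity; one standard generic form (shown here purely as an illustration, not as the paper's specific theorem) states that, with probability at least 1 - δ over an i.i.d. sample of size n, every classifier f in a class F with loss bounded in [0, 1] satisfies

    R(f) \le \hat{R}_n(f) + 2\,\hat{\mathfrak{R}}_n(\mathcal{F}) + 3\sqrt{\ln(2/\delta)/(2n)},

where the empirical complexity term \hat{\mathfrak{R}}_n(\mathcal{F}) is computed from the sample itself, which is what makes such bounds directly usable for model selection.

As for the architecture, the sketch below illustrates a two-level hierarchical-mixture-of-experts forward pass with linear softmax gates and logistic experts; the function names, the fixed two-level depth, and the particular choice of gates and experts are illustrative assumptions, not the authors' construction.

import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def hme_predict(x, gate_root, gates, experts):
    # Hypothetical two-level HME for binary classification.
    # x         : (d,) feature vector
    # gate_root : (m, d) weights of the top-level gate (m branches)
    # gates     : list of m (k, d) second-level gate weight matrices
    # experts   : m x k nested list of (d,) expert weight vectors
    # Returns P(y = 1 | x) as the gate-weighted mixture of expert outputs.
    top = softmax(gate_root @ x)                   # soft partition at the root
    p = 0.0
    for i, g_i in enumerate(gates):
        low = softmax(g_i @ x)                     # soft partition within branch i
        for j, w_ij in enumerate(experts[i]):
            # Each expert is a logistic classifier on its (soft) region.
            p += top[i] * low[j] / (1.0 + np.exp(-(w_ij @ x)))
    return p

# Example usage with random parameters:
rng = np.random.default_rng(0)
d, m, k = 3, 2, 2
x = rng.normal(size=d)
gate_root = rng.normal(size=(m, d))
gates = [rng.normal(size=(k, d)) for _ in range(m)]
experts = [[rng.normal(size=d) for _ in range(k)] for _ in range(m)]
print(hme_predict(x, gate_root, gates, experts))   # a probability in (0, 1)

Because every gate output is a softmax rather than a hard split, the partition of the feature space is soft: every expert contributes to every prediction, with an input-dependent weight, which is the "local classification" the abstract refers to.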
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Azran, A., Meir, R. (2004). Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol. 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_30
DOI: https://doi.org/10.1007/978-3-540-27819-1_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22282-8
Online ISBN: 978-3-540-27819-1