Abstract
We study the problem of aggregation under the squared loss in the regression model with deterministic design. We obtain sharp oracle inequalities for convex aggregates defined via exponential weights, under general assumptions on the error distribution and on the functions to be aggregated. We then show how these results can be applied to derive a sparsity oracle inequality.
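As a rough illustration of the kind of estimator studied here, the sketch below combines candidate predictors by exponential weighting of their residual sums of squares under the squared loss. The function name, the temperature parameter `beta`, and the implicit uniform prior over candidates are illustrative assumptions, not the paper's exact construction.

```python
import math

def exp_weight_aggregate(preds, y, beta):
    """Convex aggregation via exponential weights (generic sketch).

    preds: list of M candidate prediction vectors, each of length n.
    y:     list of n observed responses.
    beta:  temperature parameter (> 0); smaller beta concentrates
           the weights on the best-fitting candidate.
    Returns (aggregate, weights), where weights[j] is proportional
    to exp(-RSS_j / beta) and aggregate = sum_j weights[j] * preds[j].
    """
    # Residual sum of squares of each candidate under the squared loss.
    rss = [sum((p - yi) ** 2 for p, yi in zip(f, y)) for f in preds]
    # Subtract the minimum before exponentiating, for numerical stability.
    m = min(rss)
    w = [math.exp(-(r - m) / beta) for r in rss]
    total = sum(w)
    w = [wi / total for wi in w]  # normalize: weights sum to 1
    n = len(y)
    aggregate = [sum(w[j] * preds[j][i] for j in range(len(preds)))
                 for i in range(n)]
    return aggregate, w

# Example: a well-fitting candidate receives almost all the weight.
y  = [1.0, 2.0, 3.0]
f1 = [1.1, 1.9, 3.0]   # small residuals
f2 = [0.0, 0.0, 0.0]   # large residuals
agg, w = exp_weight_aggregate([f1, f2], y, beta=1.0)
```

Because the weights are nonnegative and sum to one, the aggregate is a convex combination of the candidates, which is the setting in which the sharp oracle inequalities of the paper are stated.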
Copyright information
© 2007 Springer Berlin Heidelberg
Cite this paper
Dalalyan, A.S., Tsybakov, A.B. (2007). Aggregation by Exponential Weighting and Sharp Oracle Inequalities. In: Bshouty, N.H., Gentile, C. (eds.) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol. 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_9
DOI: https://doi.org/10.1007/978-3-540-72927-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72925-9
Online ISBN: 978-3-540-72927-3