Abstract
This paper studies oracle properties of ℓ1-penalized estimators of a probability density. We show that the penalized least squares estimator satisfies sparsity oracle inequalities, i.e., bounds in terms of the number of non-zero components of the oracle vector. The results are valid even when the dimension of the model is (much) larger than the sample size. They are applied to estimation in sparse high-dimensional mixture models, to nonparametric adaptive density estimation and to the problem of aggregation of density estimators.
The research of F. Bunea and M. Wegkamp is supported in part by NSF grant DMS-0406049.
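The penalized least squares estimator described in the abstract can be illustrated with a small numerical sketch. The code below is an illustrative implementation only, assuming a dictionary of Gaussian densities with a common bandwidth and a single scalar ℓ1 penalty level (the paper's criterion uses data-dependent weights on each coefficient); the function name `spades_fit`, the grid of centers, and the tuning values are hypothetical choices, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def spades_fit(x, centers, h, penalty):
    """Minimize ||f_lambda||^2 - (2/n) sum_i f_lambda(X_i) + penalty * ||lambda||_1
    over mixing weights lambda, for a dictionary of Gaussian densities
    f_j(x) = N(x; centers[j], h^2).  All names and tuning values are illustrative."""
    n, p = len(x), len(centers)

    # Gram matrix G[j, k] = integral of f_j * f_k; for Gaussian densities with a
    # common bandwidth h this is a N(0, 2 h^2) density evaluated at c_j - c_k.
    diff = centers[:, None] - centers[None, :]
    G = norm.pdf(diff, scale=np.sqrt(2.0) * h)

    # m[j] = (1/n) sum_i f_j(X_i), the empirical surrogate for integral of f_j * f.
    m = norm.pdf(x[:, None], loc=centers[None, :], scale=h).mean(axis=0)

    # Split lambda = u - v with u, v >= 0 so the l1 term becomes linear
    # and the problem is smooth with simple box constraints.
    def objective(z):
        u, v = z[:p], z[p:]
        lam = u - v
        val = lam @ G @ lam - 2.0 * lam @ m + penalty * np.sum(u + v)
        g = 2.0 * (G @ lam) - 2.0 * m
        grad = np.concatenate([g + penalty, -g + penalty])
        return val, grad

    z0 = np.zeros(2 * p)
    res = minimize(objective, z0, jac=True, method="L-BFGS-B",
                   bounds=[(0.0, None)] * (2 * p))
    lam = res.x[:p] - res.x[p:]
    lam[np.abs(lam) < 1e-8] = 0.0  # prune numerically negligible weights
    return lam

# Toy usage: sparse recovery of a two-component Gaussian mixture.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(2.0, 0.5, 200)])
centers = np.linspace(-4.0, 4.0, 41)
lam_hat = spades_fit(x, centers, h=0.5, penalty=0.05)
print("non-zero dictionary weights:", np.flatnonzero(lam_hat))
```

In this sketch the ℓ1 penalty drives most dictionary weights to zero, so the fitted combination concentrates on a few components, which is the sparsity behavior the oracle inequalities quantify.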
Copyright information
© 2007 Springer Berlin Heidelberg
Cite this paper
Bunea, F., Tsybakov, A.B., Wegkamp, M.H. (2007). Sparse Density Estimation with ℓ1 Penalties. In: Bshouty, N.H., Gentile, C. (eds.) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol. 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_38
DOI: https://doi.org/10.1007/978-3-540-72927-3_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72925-9
Online ISBN: 978-3-540-72927-3