Abstract
Let \((X,Y)\) be an \(\mathcal{X} \times \{0,1\}\)-valued random pair and consider a sample \((X_1,Y_1),\dots,(X_n,Y_n)\) drawn from the distribution of \((X,Y)\). We aim at constructing from this sample a classifier, that is, a function that predicts the value of \(Y\) from the observation of \(X\). The special case where \(\mathcal{X}\) is a functional space is of particular interest due to the so-called curse of dimensionality. In a recent paper, Biau et al. [1] propose to filter the \(X_i\)'s in the Fourier basis and to apply the classical k-Nearest Neighbor rule to the first \(d\) coefficients of the expansion, where both \(k\) and \(d\) are selected automatically via a penalized criterion. We extend this study and show that, from the minimax point of view under some margin-type assumptions, the penalty used by Biau et al. is too heavy. We prove that using a penalty of smaller order, or even equal to zero, is preferable both in theory and in practice. Our experimental study furthermore shows that introducing a small-order penalty stabilizes the selection process while preserving rather good performance.
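The scheme described in the abstract can be sketched as follows. This is a minimal illustration only: the function names and the choice of leave-one-out error as the empirical risk are our own, the Fourier projection is approximated by Riemann sums on the observation grid, and the default zero penalty corresponds to the light penalization advocated here, not to the original criterion of Biau et al.

```python
import numpy as np

def fourier_features(curves, d, grid):
    """First d coefficients of the trigonometric Fourier expansion of each
    curve, approximated by Riemann sums on the observation grid."""
    n_grid = len(grid)
    basis = [np.ones(n_grid)]
    k = 1
    while len(basis) < d:
        basis.append(np.sqrt(2) * np.cos(2 * np.pi * k * grid))
        if len(basis) < d:
            basis.append(np.sqrt(2) * np.sin(2 * np.pi * k * grid))
        k += 1
    B = np.array(basis)            # d x n_grid design matrix
    return curves @ B.T / n_grid   # n x d coefficient matrix

def knn_error(X, y, k):
    """Leave-one-out misclassification rate of the k-NN majority-vote rule."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)    # exclude each point from its own neighbors
    idx = np.argsort(D, axis=1)[:, :k]
    votes = y[idx].mean(axis=1) >= 0.5
    return np.mean(votes != y)

def select_k_d(curves, y, grid, d_max=10, k_max=10, pen=lambda d, n: 0.0):
    """Pick (k, d) minimizing empirical risk + penalty over a grid of values.
    The default pen = 0 is the zero-penalty choice discussed in the paper."""
    n = len(y)
    best = None
    for d in range(1, d_max + 1):
        X = fourier_features(curves, d, grid)
        for k in range(1, k_max + 1):
            crit = knn_error(X, y, k) + pen(d, n)
            if best is None or crit < best[0]:
                best = (crit, k, d)
    return best[1], best[2]
```

A heavier penalty of the kind analyzed in [1] can be recovered by passing, e.g., `pen=lambda d, n: c * d / n` for some constant `c`; the point made here is that taking `c` small (or zero) does not hurt, and stabilizes the selected dimension.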
References
Biau, G., Bunea, F., Wegkamp, M.: Functional classification in Hilbert spaces. IEEE Trans. Inf. Theory 51, 2163–2172 (2005)
Devroye, L., Györfi, L., Lugosi, G.: A probabilistic theory of pattern recognition. Applications of Mathematics. Springer, New York (1996)
Boucheron, S., Bousquet, O., Lugosi, G.: Theory of classification: some recent advances. ESAIM Probability and Statistics 9, 323–375 (2005)
Stone, C.: Consistent nonparametric regression. Ann. Statist. 5, 595–645 (1977)
Devroye, L., Krzyzak, A.: An equivalence theorem for L1 convergence of the kernel regression estimate. Journal of Statistical Planning and Inference 23, 71–82 (1989)
Dabo-Niang, S., Rhomari, N.: Nonparametric regression estimation when the regressor takes its values in a metric space. Technical report, Université Paris VI (2001), http://www.ccr.jussieu.fr/lsta
Abraham, C., Biau, G., Cadre, B.: On the kernel rule for function classification. Technical report, Université de Montpellier (2003), http://www.math.univ-montp2.fr/~biau/publications.html
Kulkarni, S., Posner, S.: Rates of convergence of nearest neighbor estimation under arbitrary sampling. IEEE Trans. Inf. Theory 41, 1028–1039 (1995)
Cerou, F., Guyader, A.: Nearest neighbor classification in infinite dimension. Technical report, IRISA, Rennes, France (2005)
Ramsay, J., Silverman, B.: Functional data analysis. Springer Series in Statistics. Springer, New York (1997)
Ramsay, J., Silverman, B.: Applied functional data analysis. Springer Series in Statistics. Springer, New York (2002) (Methods and case studies)
Hastie, T., Buja, A., Tibshirani, R.: Penalized discriminant analysis. Ann. Statist. 23, 73–102 (1995)
Hall, P., Poskitt, D., Presnell, B.: A functional data-analytic approach to signal discrimination. Technometrics 43, 1–9 (2001)
Ferraty, F., Vieu, P.: Curves discrimination: a nonparametric functional approach. Comput. Statist. Data Anal. 44, 161–173 (2003)
Hastie, T., Tibshirani, R.: Discriminant adaptive nearest neighbor classification. IEEE Trans. Pattern Anal. Mach. Intell. 18, 607–616 (1996)
Rossi, F., Villa, N.: Classification in Hilbert spaces with support vector machines. In: Proceedings of ASMDA 2005, Brest, France, pp. 635–642 (2005)
Hengartner, N., Matzner-Lober, E., Wegkamp, M.: Bandwidth selection for local linear regression. Journal of the Royal Statistical Society, Series B 64, 1–14 (2002)
Vapnik, V., Chervonenkis, A.: On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab. Appl. 16, 264–280 (1971)
Lugosi, G.: Pattern classification and learning theory. In: Györfi, L. (ed.) Principles of Nonparametric Learning, pp. 1–56. Springer, Wien, New York (2002)
Vapnik, V., Chervonenkis, A.: Theory of pattern recognition: statistical problems of learning (in Russian). Nauka, Moscow (1974)
Devroye, L., Wagner, T.: Nonparametric discrimination and density estimation. Technical Report 183, Electronics Research Center, University of Texas (1976)
Vapnik, V.: Estimation of dependences based on empirical data. Springer, New York (1982)
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.: Learnability and the Vapnik-Chervonenkis dimension. J. Assoc. Comput. Mach. 36, 929–965 (1989)
Haussler, D., Littlestone, N., Warmuth, M.: Predicting {0,1}-functions on randomly drawn points. Inf. Comput. 115, 248–292 (1994)
Devroye, L., Lugosi, G.: Lower bounds in pattern recognition and learning. Pattern Recognition 28, 1011–1018 (1995)
Mammen, E., Tsybakov, A.: Smooth discrimination analysis. Ann. Statist. 27, 1808–1829 (1999)
Tsybakov, A.: Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32, 135–166 (2004)
Massart, P., Nédélec, E.: Risk bounds for statistical learning. Ann. Statist. (to appear, 2005)
Bartlett, P., Jordan, M., McAuliffe, J.: Convexity, classification, and risk bounds. Journal of the American Statistical Association 101, 138–156 (2006)
Audibert, J.Y., Tsybakov, A.: Fast learning rates for plug-in estimators under margin condition (preprint, 2005)
Massart, P.: Concentration inequalities and model selection. Lectures given at the Saint-Flour summer school of probability theory. Lect. Notes Math. (to appear, 2003)
Györfi, L.: On the rate of convergence of k-nearest-neighbor classification rule (preprint, 2005)
Birgé, L., Massart, P.: Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4, 329–375 (1998)
Yang, Y.: Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika 92, 937–950 (2005)
Berlinet, A., Biau, G., Rouvière, L.: Functional learning with wavelets. IEEE Trans. Inf. Theory (to appear, 2005)
Tuleau, C.: Sélection de variables pour la discrimination en grande dimension et classification de données fonctionnelles. PhD thesis, Université Paris Sud (2005)
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Fromont, M., Tuleau, C. (2006). Functional Classification with Margin Conditions. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_10
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9