Abstract
Feature reduction is often an essential part of solving a classification task. One common approach is Principal Component Analysis (PCA), in which the low-variance directions in the data are removed and the high-variance directions are retained, in the hope that the high-variance directions contain information about the class differences. In one-class classification, or novelty detection, the task contains one ill-determined class for which (almost) no information is available. In this paper we show that for one-class classification the low-variance directions are the most informative, and that feature reduction involves a bias-variance trade-off, so that retaining the high-variance directions is often not optimal.
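The claim about low-variance directions can be illustrated with a small toy sketch. It is not the experiment from the paper: the two-dimensional Gaussian target data, the four hand-picked outliers, the 95% acceptance threshold, and the helper names `principal_axes`, `score`, and `count_rejected` are all assumptions made for this illustration. The target class is stretched along one axis, and the outliers deviate from it only along the other, low-variance axis.

```python
import math
import random


def principal_axes(points):
    """PCA for 2-D points: return (mean, [(variance, unit_vector), ...]),
    with axes sorted from highest to lowest variance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Closed-form eigendecomposition of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    half_trace = 0.5 * (sxx + syy)
    root = math.hypot(0.5 * (sxx - syy), sxy)
    high = (half_trace + root, (math.cos(theta), math.sin(theta)))
    low = (half_trace - root, (-math.sin(theta), math.cos(theta)))
    return (mx, my), [high, low]


def score(point, mean, axis):
    """Squared distance to the mean along one axis, scaled by its variance."""
    variance, (vx, vy) = axis
    proj = (point[0] - mean[0]) * vx + (point[1] - mean[1]) * vy
    return proj * proj / variance


def count_rejected(outliers, targets, mean, axis, accept=0.95):
    """Pick a threshold that accepts about 95% of the targets,
    then count how many outliers score above it."""
    target_scores = sorted(score(p, mean, axis) for p in targets)
    threshold = target_scores[int(accept * len(target_scores))]
    return sum(score(p, mean, axis) > threshold for p in outliers)


# Assumed toy data: target class with std 3.0 along x and 0.3 along y.
rng = random.Random(0)
targets = [(rng.gauss(0.0, 3.0), rng.gauss(0.0, 0.3)) for _ in range(500)]
# Assumed outliers: ordinary x values, but far from the targets along y.
outliers = [(-2.0, 1.5), (0.0, -1.5), (1.0, 2.0), (3.0, -2.0)]

mean, (high_axis, low_axis) = principal_axes(targets)
detected_high = count_rejected(outliers, targets, mean, high_axis)
detected_low = count_rejected(outliers, targets, mean, low_axis)
print("rejected using high-variance axis:", detected_high, "of", len(outliers))
print("rejected using low-variance axis:", detected_low, "of", len(outliers))
```

In this constructed setting, keeping only the high-variance axis discards exactly the direction in which the outliers differ from the target class, whereas the low-variance axis separates them. The sketch does not address the bias-variance trade-off mentioned in the abstract, which concerns how reliably the low-variance directions can be estimated from limited data.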
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tax, D.M.J., Müller, K.-R. (2003). Feature Extraction for One-Class Classification. In: Kaynak, O., Alpaydin, E., Oja, E., Xu, L. (eds) Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003. Lecture Notes in Computer Science, vol 2714. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44989-2_41
DOI: https://doi.org/10.1007/3-540-44989-2_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40408-8
Online ISBN: 978-3-540-44989-8
eBook Packages: Springer Book Archive