DOI: 10.1145/1273496.1273512

Feature selection in a kernel space

Published: 20 June 2007

Abstract

We address the problem of feature selection in a kernel space, selecting the most discriminative and informative features for classification and data analysis. This problem is difficult because the dimension of a kernel space may be infinite, and little prior work has addressed feature selection in such a space. As a first step, we derive a basis set in the kernel space. Using this basis set, we then extend margin-based feature selection algorithms, which have proven effective even when many features are dependent. The selected features form a subspace of the kernel space, in which different state-of-the-art classification algorithms can be applied. We conduct extensive experiments on real and simulated data to compare our method with four baseline algorithms. Both theoretical analysis and experimental results validate the effectiveness of the proposed method.
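The abstract gives only a high-level outline of the two-step method. The Python sketch below shows one plausible way to realize those steps, under stated assumptions: the kernel-space basis is derived from an eigendecomposition of the training Gram matrix (the empirical kernel map), and the margin-based step is a simple Relief-style feature weighting over the resulting coordinates. The function names, the RBF kernel, and all parameters here are illustrative assumptions, not the authors' actual algorithm or code.

import numpy as np

def empirical_kernel_map(K, tol=1e-10):
    # Step 1 (assumed): eigendecompose the symmetric PSD Gram
    # matrix K = V diag(w) V^T to get a finite basis for the
    # span of the training points in the kernel space.
    w, V = np.linalg.eigh(K)
    keep = w > tol  # drop numerically null directions
    # Coordinates Z satisfy Z @ Z.T == K: one kernel-space
    # "feature" per retained eigenvector.
    return V[:, keep] * np.sqrt(w[keep])

def relief_weights(Z, y, n_iter=200, seed=0):
    # Step 2 (assumed): Relief-style margin-based weighting of
    # the kernel-space coordinates Z.
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dists = np.abs(Z - Z[i]).sum(axis=1)  # L1 distances to sample i
        dists[i] = np.inf                     # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dists, np.inf))    # nearest same-class
        miss = np.argmin(np.where(~same, dists, np.inf))  # nearest other-class
        # Reward features that push the nearest miss away and
        # keep the nearest hit close.
        w += np.abs(Z[i] - Z[miss]) - np.abs(Z[i] - Z[hit])
    return w

# Toy usage with an RBF kernel (all values illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)
Z = empirical_kernel_map(K)
scores = relief_weights(Z, y)
selected = np.argsort(scores)[::-1][:10]  # top-scoring kernel-space features

The design point this illustrates: scoring coordinates of a finite basis spanned by the training data, rather than the kernel space itself, keeps margin-based selection tractable even when the kernel-induced space is infinite-dimensional.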




Published In

ICML '07: Proceedings of the 24th international conference on Machine learning
June 2007
1233 pages
ISBN: 9781595937933
DOI: 10.1145/1273496
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • Machine Learning Journal

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

ICML '07 & ILP '07

Acceptance Rates

Overall acceptance rate: 140 of 548 submissions (26%)

