Abstract
This paper presents an algorithmic framework for feature selection that selects a subset of features by minimizing the nonparametric Bayes error. Several existing algorithms, as well as new ones, can be derived naturally from this framework. For example, we show that the Relief algorithm greedily attempts to minimize the Bayes error as estimated by the k-nearest-neighbor method. This new interpretation not only explains why Relief works but also offers opportunities to improve it or to establish new alternatives. In particular, we develop a new feature weighting algorithm, named Parzen-Relief, which minimizes the Bayes error as estimated by the Parzen window method. Additionally, to enhance its ability to handle imbalanced and multiclass data, we integrate the class distribution into the max-margin objective function, leading to a new algorithm, named MAP-Relief. Comparisons on benchmark data sets confirm the effectiveness of the proposed algorithms.
This work is supported in part by NSFC (#60275025, #60121302).
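The abstract's interpretation rests on the classic Relief weight update (Kira and Rendell, 1992): for each sampled instance, a feature's weight grows with its contribution to the distance from the nearest opposite-class point (the "nearest miss") and shrinks with its contribution to the distance from the nearest same-class point (the "nearest hit"). A minimal sketch of that two-class update, assuming an L1 distance and a full pass over the data (function name and defaults are illustrative, not from the paper):

```python
import numpy as np

def relief_weights(X, y):
    """Two-class Relief feature weights.

    For each instance, reward features that separate it from its
    nearest miss (closest opposite-class point) and penalize features
    that separate it from its nearest hit (closest same-class point).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(X - X[i]).sum(axis=1)  # L1 distance to every instance
        dist[i] = np.inf                     # exclude the instance itself
        hit_mask = y == y[i]
        hit = np.argmin(np.where(hit_mask, dist, np.inf))   # nearest hit
        miss = np.argmin(np.where(~hit_mask, dist, np.inf)) # nearest miss
        # Per-feature margin: miss distance minus hit distance
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n
```

Under the paper's framing, this greedy accumulation is a stochastic approximation to minimizing the 1-NN estimate of the Bayes error; swapping the hit/miss distances for Parzen kernel densities would yield the Parzen-Relief variant described above.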
© 2008 Springer-Verlag Berlin Heidelberg
Yang, S.H., Hu, B.G. (2008). Feature Selection by Nonparametric Bayes Error Minimization. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol. 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science (R0)