Abstract
Feature selection is a popular technique for reducing dimensionality. Commonly, features are evaluated with univariate scores according to their individual classification abilities, and the highest-scoring ones are selected. However, this strategy has two flaws. First, feature complementarity is ignored: a subspace built from features that are only moderately discriminative on their own but complement one another can be well suited to a recognition task, yet such a subset cannot be found by this strategy. Second, feature redundancy with respect to classification cannot be measured accurately; this redundancy weakens the subset's discriminative power, but the strategy cannot reduce it. In this paper, a new feature selection method is proposed. It assesses each feature's discriminative information for every class and encodes that information as a vector. Features are then represented by their discriminative-information vectors, and the most distinct ones are selected. Both feature complementarity and classification redundancy can be measured directly by comparing the differences between these vectors. Experimental results on both low-dimensional and high-dimensional data confirm the new method's effectiveness.
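The abstract fixes the idea but not the concrete score or selection rule, so the following is only a minimal sketch under stated assumptions: a per-class Fisher-style ratio stands in for the paper's discriminative information, and greedy farthest-point selection over the resulting vectors stands in for picking "the most distinct ones". All function names are hypothetical, not the authors' implementation.

    import numpy as np

    def discriminative_vectors(X, y):
        """Represent each feature by a vector of per-class discriminative
        information. The abstract does not fix the score, so as a stand-in
        we use a per-class Fisher-style ratio: squared distance between the
        class mean and the overall mean, divided by the class variance."""
        classes = np.unique(y)
        V = np.zeros((X.shape[1], len(classes)))
        overall_mean = X.mean(axis=0)
        for k, c in enumerate(classes):
            Xc = X[y == c]
            var = Xc.var(axis=0) + 1e-12          # guard against zero variance
            V[:, k] = (Xc.mean(axis=0) - overall_mean) ** 2 / var
        return V

    def select_distinct_features(X, y, n_select):
        """Greedy farthest-point selection: start from the feature whose
        vector has the largest norm (strongest overall discrimination),
        then repeatedly add the feature whose vector is farthest from all
        vectors already chosen."""
        V = discriminative_vectors(X, y)
        selected = [int(np.argmax(np.linalg.norm(V, axis=1)))]
        while len(selected) < n_select:
            # distance from each candidate vector to its nearest selected vector
            d = np.min(
                np.linalg.norm(V[:, None, :] - V[selected][None, :, :], axis=2),
                axis=1,
            )
            d[selected] = -np.inf                 # never re-pick a feature
            selected.append(int(np.argmax(d)))
        return selected

Under these assumptions, redundant features yield nearly identical vectors and are skipped by the farthest-point rule, while complementary features (each discriminative for different classes) have dissimilar vectors and are favoured, mirroring the vector-comparison idea described in the abstract.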
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China (61070089) and the Science Foundation of Tianjin (14JCYBJC15700).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, J., Xu, H., Wei, J. (2016). Feature Selection via Vectorizing Feature’s Discriminative Information. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol. 9931. Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_40
DOI: https://doi.org/10.1007/978-3-319-45814-4_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45813-7
Online ISBN: 978-3-319-45814-4