Abstract
Feature selection is a popular technique for reducing dimensionality. Commonly, features are evaluated with univariate scores according to their individual classification abilities, and the highest-scoring ones are selected. However, this strategy has two flaws. First, feature complementarity is ignored: a subspace built from features that are only moderately discriminative on their own but complement one another can be well suited to a recognition task, yet such a subset cannot be found by this strategy. Second, feature redundancy with respect to classification cannot be measured accurately; this redundancy weakens the subset's discriminative power, but the strategy cannot reduce it. In this paper, a new feature selection method is proposed. It assesses each feature's discriminative information for every class and encodes that information as a vector. Features are then represented by their discriminative-information vectors, and the most distinct ones are selected. Both feature complementarity and classification redundancy can be measured directly by comparing the differences between these vectors. Experimental results on both low-dimensional and high-dimensional data confirm the new method's effectiveness.
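The abstract fixes the idea but not the concrete score or selection rule, so the following is only a minimal sketch under stated assumptions: a per-class Fisher-style ratio stands in for the paper's discriminative information, and greedy farthest-point selection over the resulting vectors stands in for picking "the most distinct ones". All function names are hypothetical, not the authors' implementation.

    import numpy as np

    def discriminative_vectors(X, y):
        """Represent each feature by a vector of per-class discriminative
        information. The abstract does not fix the score, so as a stand-in
        we use a per-class Fisher-style ratio: squared distance between the
        class mean and the overall mean, divided by the class variance."""
        classes = np.unique(y)
        V = np.zeros((X.shape[1], len(classes)))
        overall_mean = X.mean(axis=0)
        for k, c in enumerate(classes):
            Xc = X[y == c]
            var = Xc.var(axis=0) + 1e-12          # guard against zero variance
            V[:, k] = (Xc.mean(axis=0) - overall_mean) ** 2 / var
        return V

    def select_distinct_features(X, y, n_select):
        """Greedy farthest-point selection: start from the feature whose
        vector has the largest norm (strongest overall discrimination),
        then repeatedly add the feature whose vector is farthest from all
        vectors already chosen."""
        V = discriminative_vectors(X, y)
        selected = [int(np.argmax(np.linalg.norm(V, axis=1)))]
        while len(selected) < n_select:
            # distance from each candidate vector to its nearest selected vector
            d = np.min(
                np.linalg.norm(V[:, None, :] - V[selected][None, :, :], axis=2),
                axis=1,
            )
            d[selected] = -np.inf                 # never re-pick a feature
            selected.append(int(np.argmax(d)))
        return selected

Under these assumptions, redundant features yield nearly identical vectors and are skipped by the farthest-point rule, while complementary features (each discriminative for different classes) have dissimilar vectors and are favoured, mirroring the vector-comparison idea described in the abstract.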
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China (61070089) and the Science Foundation of Tianjin (14JCYBJC15700).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, J., Xu, H., Wei, J. (2016). Feature Selection via Vectorizing Feature’s Discriminative Information. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol. 9931. Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_40
DOI: https://doi.org/10.1007/978-3-319-45814-4_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45813-7
Online ISBN: 978-3-319-45814-4