
Feature Selection via Vectorizing Feature’s Discriminative Information

  • Conference paper
Web Technologies and Applications (APWeb 2016)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9931)


Abstract

Feature selection is a popular technique for reducing dimensionality. Features are commonly evaluated with univariate scores that measure their individual classification abilities, and the highest-scoring ones are selected. This strategy has two flaws. First, feature complementarity is ignored: a subspace built from features that are only partially discriminative on their own but complementary to one another can be well suited to the recognition task, yet such a subset cannot be selected by univariate scoring. Second, feature redundancy with respect to classification cannot be measured accurately; this redundancy weakens the subset’s discriminative performance, but univariate scoring cannot reduce it. In this paper, a new feature selection method is proposed. It assesses each feature’s discriminative information for every class and vectorizes this information. Features are then represented by their discriminative information vectors, and the most distinct ones are selected. Both feature complementarity and classification redundancy can be measured directly by comparing the differences between these vectors. Experimental results on both low-dimensional and high-dimensional data confirm the new method’s effectiveness.
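
To make the idea concrete, the sketch below shows one possible reading of the approach. The abstract does not specify how per-class discriminative information is computed or how the "most distinct" vectors are chosen, so the sketch substitutes assumed stand-ins: mutual information against one-vs-rest class indicators for the per-class scores, and a greedy rule that keeps each newly selected vector far from the vectors already chosen. The names discriminative_vectors and select_distinct are hypothetical, not from the paper.

```python
# Illustrative sketch only: the paper's exact per-class score and selection
# rule are not given in the abstract. Here each feature's "discriminative
# information vector" is approximated by its mutual information with a
# one-vs-rest indicator for every class, and features are greedily chosen
# so that the selected vectors stay maximally distinct from one another.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif

def discriminative_vectors(X, y):
    """One row per feature; column c holds the feature's mutual information
    with the one-vs-rest indicator of class c (an assumed proxy for the
    paper's per-class discriminative information)."""
    classes = np.unique(y)
    V = np.empty((X.shape[1], classes.size))
    for j, c in enumerate(classes):
        V[:, j] = mutual_info_classif(X, (y == c).astype(int), random_state=0)
    return V

def select_distinct(V, k):
    """Greedy selection: start from the feature with the largest overall
    score, then repeatedly add the feature whose vector is farthest (in
    Euclidean distance) from its nearest already-selected vector."""
    selected = [int(np.argmax(V.sum(axis=1)))]
    while len(selected) < k:
        rest = [i for i in range(V.shape[0]) if i not in selected]
        dists = [min(np.linalg.norm(V[i] - V[s]) for s in selected)
                 for i in rest]
        selected.append(rest[int(np.argmax(dists))])
    return selected

X, y = load_iris(return_X_y=True)
V = discriminative_vectors(X, y)
print("selected feature indices:", select_distinct(V, k=2))
```

Under this reading, redundancy shows up as near-identical vectors (small pairwise distances), while complementary features, each informative about different classes, yield distinct vectors and are favored by the greedy rule.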




Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (61070089) and the Science Foundation of Tianjin (14JCYBJC15700).

Author information


Corresponding author

Correspondence to Jinmao Wei.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, J., Xu, H., Wei, J. (2016). Feature Selection via Vectorizing Feature’s Discriminative Information. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9931. Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_40


  • DOI: https://doi.org/10.1007/978-3-319-45814-4_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45813-7

  • Online ISBN: 978-3-319-45814-4

  • eBook Packages: Computer Science (R0)
