Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

  • Conference paper
New Frontiers in Mining Complex Patterns (NFMCP 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10312)

Abstract

In an era where accumulating data is easy and storing it is inexpensive, feature selection plays a central role in reducing the high dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones within an arbitrary set of cues. Once the problem is mapped onto an affinity graph whose nodes are the features, the solution is given by assessing the importance of the nodes through indicators of centrality, in particular Eigenvector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking nodes by centrality identifies candidate features, which turn out to be effective from a classification point of view, as shown in a thorough experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others) and compared against filter, embedded and wrapper methods. The results are remarkable in terms of accuracy, stability and low execution time.
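
The idea of scoring a feature by the importance of its neighbors can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the edges of the affinity graph are weighted, so the absolute Pearson correlation used here is an assumption made purely for illustration, and the function names `eigenvector_centrality` and `rank_features` are hypothetical.

```python
import numpy as np


def eigenvector_centrality(affinity):
    """Return the leading eigenvector of a symmetric, non-negative matrix."""
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue.
    _, vectors = np.linalg.eigh(affinity)
    # The sign of an eigenvector is arbitrary; for a connected graph the
    # leading one has entries of a single sign, so abs() removes the ambiguity.
    return np.abs(vectors[:, -1])


def rank_features(X):
    """Rank the columns of X (samples x features), most central first."""
    # Assumption (not from the paper): edge weight = absolute Pearson
    # correlation between two feature columns.
    affinity = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(affinity, 0.0)  # no self-loops
    scores = eigenvector_centrality(affinity)
    return np.argsort(-scores), scores


# Toy data: three noisy copies of one signal plus one unrelated column.
rng = np.random.default_rng(0)
signal = rng.normal(size=(200, 1))
X = np.hstack([signal + 0.1 * rng.normal(size=(200, 3)),
               rng.normal(size=(200, 1))])
order, scores = rank_features(X)
print(order, scores.round(3))
```

In this toy setup the unrelated column should end up last in the ranking, since its edges to the other features are weak. Whether high centrality marks a useful feature depends entirely on how the affinity is defined, which is why the choice of edge weight above is flagged as an assumption.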

Notes

  1. The FSLib is publicly available on File Exchange - MATLAB Central at: https://it.mathworks.com/matlabcentral/fileexchange/56937-feature-selection-library.

Author information

Corresponding author

Correspondence to Giorgio Roffo.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Roffo, G., Melzi, S. (2017). Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality. In: Appice, A., Ceci, M., Loglisci, C., Masciari, E., Raś, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2016. Lecture Notes in Computer Science, vol 10312. Springer, Cham. https://doi.org/10.1007/978-3-319-61461-8_2

  • DOI: https://doi.org/10.1007/978-3-319-61461-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61460-1

  • Online ISBN: 978-3-319-61461-8

  • eBook Packages: Computer Science, Computer Science (R0)
