On the Use of Spearman’s Rho to Measure the Stability of Feature Rankings

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2017)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10255)

Abstract

Producing stable feature rankings is critical in many areas, such as bioinformatics, where the robustness of a ranked list of genes is crucial to its interpretation by a domain expert. In this paper, we study Spearman’s rho as a measure of stability to training data perturbations. We treat it not just as a heuristic: we prove that it is the natural measure of stability when using mean rank aggregation. We provide insights into the properties of this stability measure that allow a useful interpretation of stability values, e.g. how close a stability value is to that of a purely random feature ranking process, and we discuss concepts such as the expected value of a stability estimator.
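
As a rough illustration of the idea in the abstract, the sketch below averages Spearman’s rho over all pairs of feature rankings obtained from perturbed training sets, and aggregates the rankings by mean rank. It is a minimal sketch and not the paper's implementation: the function names, the tie-free assumption (each ranking is a permutation of 1..d), the averaging over pairs, and the tie-breaking rule in the aggregation are all assumptions made here for illustration.

```python
from itertools import combinations

import numpy as np


def spearman_rho(r1, r2):
    """Spearman's rho between two tie-free rankings (permutations of 1..d)."""
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    d = r1.size
    # Closed form for rankings without ties: 1 - 6 * sum(diff^2) / (d (d^2 - 1)).
    return 1.0 - 6.0 * np.sum((r1 - r2) ** 2) / (d * (d ** 2 - 1))


def average_pairwise_rho(rankings):
    """Mean Spearman's rho over all pairs of rows of an (M, d) array.

    Assumption: stability is estimated as the average over all pairs.
    """
    rankings = np.asarray(rankings)
    pairs = combinations(range(rankings.shape[0]), 2)
    return float(np.mean([spearman_rho(rankings[i], rankings[j]) for i, j in pairs]))


def mean_rank_aggregation(rankings):
    """Rank features by their mean rank across the M rankings (1 = best).

    Assumption: equal mean ranks are broken by feature index.
    """
    mean_ranks = np.asarray(rankings, dtype=float).mean(axis=0)
    return np.argsort(np.argsort(mean_ranks, kind="stable"), kind="stable") + 1


if __name__ == "__main__":
    # Hypothetical rankings of d = 4 features from M = 3 perturbed training sets.
    rankings = np.array([
        [1, 2, 3, 4],
        [1, 3, 2, 4],
        [2, 1, 3, 4],
    ])
    print("stability:", average_pairwise_rho(rankings))
    print("aggregated ranking:", mean_rank_aggregation(rankings))
```

Under these assumptions, identical rankings give a value of 1, a pair of fully reversed rankings gives -1, and rankings drawn uniformly at random give values near 0 on average, which is the kind of reference point the abstract alludes to when interpreting a stability value.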

Notes

  1. Available online at http://www.cs.man.ac.uk/~nogueirs/files/IbPRIA2017-supplementary-material.pdf.

Acknowledgements

The authors gratefully acknowledge the support of the EPSRC for the Manchester Centre for Doctoral Training in Computer Science (EP/I028099/1) and the LAMBDA project (EP/N035127/1).

Author information

Corresponding author

Correspondence to Sarah Nogueira.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Nogueira, S., Sechidis, K., Brown, G. (2017). On the Use of Spearman’s Rho to Measure the Stability of Feature Rankings. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_42

  • DOI: https://doi.org/10.1007/978-3-319-58838-4_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58837-7

  • Online ISBN: 978-3-319-58838-4

  • eBook Packages: Computer Science, Computer Science (R0)
