On the Use of Spearman’s Rho to Measure the Stability of Feature Rankings

Nogueira, Sarah; Sechidis, Konstantinos; Brown, Gavin

doi:10.1007/978-3-319-58838-4_42


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10255)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

Producing stable feature rankings is critical in many areas, such as bioinformatics, where the robustness of a ranked list of genes is crucial to interpretation by a domain expert. In this paper, we study Spearman’s rho as a measure of stability to training data perturbations, not merely as a heuristic: we prove that it is the natural measure of stability when using mean rank aggregation. We provide insights into the properties of this stability measure that allow a useful interpretation of stability values, e.g. how close a stability value is to that of a purely random feature ranking process, and we discuss concepts such as the expected value of a stability estimator.
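
As a rough illustration of the quantities the abstract names, the sketch below computes Spearman’s rho between feature rankings and a mean rank aggregate. The abstract does not give the form of the stability estimator; the average pairwise rho over rankings obtained from perturbed training sets is assumed here as one common choice. The function names are hypothetical, and rankings are assumed to be complete and free of ties.

```python
from itertools import combinations


def spearman_rho(r1, r2):
    """Spearman's rho between two full rankings without ties.

    Each argument lists the rank (1..d) assigned to each of the d features.
    """
    d = len(r1)
    if d < 2 or len(r2) != d:
        raise ValueError("rankings must have the same length d >= 2")
    sq_diff = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1.0 - 6.0 * sq_diff / (d * (d * d - 1))


def ranking_stability(rankings):
    """Average pairwise Spearman's rho over M >= 2 rankings.

    Assumption: this averaging is an illustrative estimator, not
    necessarily the exact definition used in the paper.
    """
    pairs = list(combinations(rankings, 2))
    if not pairs:
        raise ValueError("need at least two rankings")
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


def mean_rank_aggregation(rankings):
    """Aggregate rankings by averaging each feature's rank across runs."""
    m = len(rankings)
    return [sum(col) / m for col in zip(*rankings)]


# Three rankings of four features, e.g. from three resampled training sets.
runs = [[1, 2, 3, 4], [1, 2, 3, 4], [2, 1, 3, 4]]
stability = ranking_stability(runs)
aggregate = mean_rank_aggregation(runs)
```

Under this assumed estimator, identical rankings give a value of 1 and a pair of exactly reversed rankings gives -1, so values near 0 suggest behaviour close to a random ranking process.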

Notes

1. Available online at http://www.cs.man.ac.uk/~nogueirs/files/IbPRIA2017-supplementary-material.pdf.

Acknowledgements

The authors gratefully acknowledge the support of the EPSRC for the Manchester Centre for Doctoral Training in Computer Science (EP/I028099/1) and the LAMBDA project (EP/N035127/1).

Author information

Authors and Affiliations

School of Computer Science, University of Manchester, Manchester, M13 9PL, UK
Sarah Nogueira, Konstantinos Sechidis & Gavin Brown

Corresponding author

Correspondence to Sarah Nogueira.

Editor information

Editors and Affiliations

Universidade da Beira Interior, Covilhã, Portugal
Luís A. Alexandre
University Jaume I, Castellón, Spain
José Salvador Sánchez
University of the Algarve, Faro, Portugal
João M. F. Rodrigues

About this paper

Cite this paper

Nogueira, S., Sechidis, K., Brown, G. (2017). On the Use of Spearman’s Rho to Measure the Stability of Feature Rankings. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_42

DOI: https://doi.org/10.1007/978-3-319-58838-4_42
Published: 12 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58837-7
Online ISBN: 978-3-319-58838-4
eBook Packages: Computer Science, Computer Science (R0)
