Efficient All Relevant Feature Selection with Random Ferns

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10352)

Abstract

Many machine learning methods can produce variable importance scores that express the usefulness of each feature in the context of the produced model. On their own, however, these scores are not sufficient to perform feature selection, especially when an all relevant selection is required. Wrapper methods address this problem, mostly by estimating the expected distribution of importance among irrelevant features. Such estimation, however, often requires substantial computational effort.

In this paper I propose a method of incorporating such estimation into the training process of a random ferns classifier, and evaluate it as an all relevant feature selector, both directly and as part of a dedicated wrapper approach. The results demonstrate its effectiveness and computational efficiency.
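To make the idea of estimating the importance distribution of irrelevant features more concrete, the following is a minimal sketch of the general contrast ("shadow") approach used by wrapper methods. It is not the method proposed in the paper and it does not use random ferns: the importance score here is a stand-in (absolute Pearson correlation with the class label), and the names `abs_correlation`, `select_against_shadows`, `rounds` and `seed` are hypothetical, chosen only for illustration.

```python
import random
from statistics import mean, pstdev


def abs_correlation(xs, ys):
    """Stand-in importance score (an assumption): absolute Pearson correlation."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no information
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return abs(cov / (sx * sy))


def select_against_shadows(columns, labels, rounds=20, seed=0):
    """Keep features whose importance exceeds every shadow importance.

    columns: dict mapping a feature name to a list of numeric values.
    labels:  list of numeric class labels, same length as each column.
    """
    rng = random.Random(seed)
    importance = {
        name: abs_correlation(col, labels) for name, col in columns.items()
    }

    # Shadows are permuted copies of the real columns. Permutation breaks any
    # link with the labels, so shadow scores sample the importance that an
    # irrelevant feature could reach by chance.
    shadow_max = 0.0
    for _ in range(rounds):
        for col in columns.values():
            shadow = list(col)
            rng.shuffle(shadow)
            shadow_max = max(shadow_max, abs_correlation(shadow, labels))

    selected = [name for name, score in importance.items() if score > shadow_max]
    return selected, importance, shadow_max
```

Calling `select_against_shadows(columns, labels)` returns the selected feature names together with the raw scores and the largest shadow score seen. The cost is visible in the nested loop: every round rescoring all shadows is the kind of repeated work that the abstract describes as computationally demanding, and that a method folding the estimation into classifier training would aim to avoid.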

Acknowledgements

This work has been financed by the National Science Centre, grant 2011/01/N/ST6/07035, as well as with the support of the OCEAN—Open Centre for Data and Data Analysis Project, co-financed by the European Regional Development Fund under the Innovative Economy Operational Programme. Computations were performed at ICM, grant G48-6.

Author information

Corresponding author

Correspondence to Miron Bartosz Kursa.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Kursa, M.B. (2017). Efficient All Relevant Feature Selection with Random Ferns. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham. https://doi.org/10.1007/978-3-319-60438-1_30

  • DOI: https://doi.org/10.1007/978-3-319-60438-1_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60437-4

  • Online ISBN: 978-3-319-60438-1

  • eBook Packages: Computer Science, Computer Science (R0)
