A New Fuzzy-Rough Hybrid Merit to Feature Selection

Transactions on Rough Sets XX

Part of the book series: Lecture Notes in Computer Science (TRS, volume 10020)

Abstract

Feature selection is considered one of the most important preprocessing methods in machine learning, data mining, and bioinformatics. Preprocessing techniques mitigate the curse of dimensionality: they reduce computational and storage costs, facilitate data understanding and visualization, and shorten training and testing times, which improves overall performance, especially on large datasets. The correlation feature selection method uses a conventional merit to evaluate candidate feature subsets. In this paper, we propose a new merit that adapts correlation feature selection and employs it in conjunction with fuzzy-rough feature selection to improve the effectiveness and quality of the conventional methods. The proposed method also outperforms the recently introduced gradient boosted feature selection by selecting more relevant and less redundant features. The two-step experimental results show the applicability and efficiency of the proposed method on well-known and widely used datasets as well as newly introduced ones, drawn mainly from the UCI collection and ranging from small to large numbers of features and samples.
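
For orientation, the conventional merit used by correlation feature selection is usually written as Merit_S = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf is the mean feature-class correlation, and r_ff is the mean feature-feature correlation. The sketch below illustrates only that conventional merit, not the hybrid merit proposed in this chapter, which the abstract does not define. The function name `cfs_merit` and the toy data are hypothetical, and the use of absolute Pearson correlation as the correlation measure is an assumption made for brevity.

```python
from itertools import combinations

import numpy as np


def cfs_merit(X, y, subset):
    """Conventional CFS merit of a feature subset (illustrative sketch).

    Assumption: correlations are absolute Pearson coefficients.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean feature-class correlation over the subset.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        # With a single feature there is no redundancy term.
        return float(r_cf)
    # Mean feature-feature correlation over all pairs in the subset.
    r_ff = np.mean([
        abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
        for i, j in combinations(subset, 2)
    ])
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))


# Toy data (hypothetical): column 0 equals the target, column 1 duplicates
# column 0 (fully redundant), column 2 is independent noise (irrelevant).
rng = np.random.default_rng(0)
y = rng.normal(size=200)
X = np.column_stack([y, y, rng.normal(size=200)])

for subset in ([0], [0, 1], [0, 2]):
    print(subset, cfs_merit(X, y, subset))
```

The merit rewards subsets whose features correlate with the class and penalizes subsets whose features correlate with each other, so adding a redundant copy of a feature does not raise the score, and adding an irrelevant feature lowers it.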

Acknowledgments

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Research & Development Corporation of Newfoundland and Labrador (RDC).

Author information

Corresponding author

Correspondence to Javad Rahimipour Anaraki.

Copyright information

© 2016 Springer-Verlag GmbH Germany

About this chapter

Cite this chapter

Anaraki, J.R., Samet, S., Banzhaf, W., Eftekhari, M. (2016). A New Fuzzy-Rough Hybrid Merit to Feature Selection. In: Peters, J., Skowron, A. (eds) Transactions on Rough Sets XX. Lecture Notes in Computer Science, vol 10020. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53611-7_1

  • DOI: https://doi.org/10.1007/978-3-662-53611-7_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-53610-0

  • Online ISBN: 978-3-662-53611-7

  • eBook Packages: Computer Science, Computer Science (R0)
