A New Fuzzy-Rough Hybrid Merit to Feature Selection

Anaraki, Javad Rahimipour; Samet, Saeed; Banzhaf, Wolfgang; Eftekhari, Mahdi

doi:10.1007/978-3-662-53611-7_1

Part of the book series: Lecture Notes in Computer Science (TRS, volume 10020)

Abstract

Feature selection is considered one of the most important preprocessing methods in machine learning, data mining and bioinformatics. Applying preprocessing techniques counters the curse of dimensionality: it reduces computational and storage costs, facilitates data understanding and visualization, and shortens training and testing times, which improves overall performance, especially on large datasets. The correlation feature selection method uses a conventional merit to evaluate candidate feature subsets. In this paper, we propose a new merit that adapts correlation feature selection and employs it in conjunction with fuzzy-rough feature selection, to improve the effectiveness and quality of the conventional methods. The proposed method also outperforms the recently introduced gradient boosted feature selection by selecting more relevant and less redundant features. Results from a two-step experiment show the applicability and efficiency of the proposed method on well-known, widely used datasets as well as newly introduced ones, especially from the UCI collection, ranging from small to large numbers of features and samples.
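
For orientation, the conventional merit mentioned above can be sketched as follows. This is a minimal illustration of the standard correlation feature selection merit as it is commonly stated, Merit = k·r_cf / sqrt(k + k(k−1)·r_ff), and not the new fuzzy-rough hybrid merit proposed in the chapter, whose definition is not given in this abstract. The function name `cfs_merit` and its argument names are hypothetical, and the correlation values are assumed to be precomputed absolute correlations.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Conventional CFS merit for one candidate feature subset (illustrative).

    feature_class_corr: absolute correlation of each selected feature with the class.
    feature_feature_corr: absolute correlation of each pair of selected features.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    # Mean feature-class correlation (relevance).
    r_cf = sum(feature_class_corr) / k
    # Mean feature-feature correlation (redundancy); zero when there are no pairs.
    r_ff = (
        sum(feature_feature_corr) / len(feature_feature_corr)
        if feature_feature_corr
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Two equally relevant features: an uncorrelated pair versus a fully redundant pair.
independent_pair = cfs_merit([0.6, 0.6], [0.0])
redundant_pair = cfs_merit([0.6, 0.6], [1.0])
```

In this sketch the uncorrelated pair scores higher than the redundant pair, which reflects the preference for relevant, non-redundant features that the abstract describes.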

Acknowledgments

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Research & Development Corporation of Newfoundland and Labrador (RDC).

Author information

Authors and Affiliations

Department of Computer Science, Memorial University of Newfoundland, St. John’s, NL, A1B 3X5, Canada
Javad Rahimipour Anaraki
Faculty of Medicine, Memorial University of Newfoundland, St. John’s, NL, A1B 3V6, Canada
Saeed Samet
Department of Computer Science, Memorial University of Newfoundland, St. John’s, NL, A1B 3X5, Canada
Wolfgang Banzhaf
Department of Computer Engineering, Shahid Bahonar University of Kerman, 7616914111, Kerman, Iran
Mahdi Eftekhari

Corresponding author

Correspondence to Javad Rahimipour Anaraki.

Editor information

Editors and Affiliations

University of Manitoba, Winnipeg, Manitoba, Canada
James F. Peters
University of Warsaw, Warsaw, Poland
Andrzej Skowron

About this chapter

Cite this chapter

Anaraki, J.R., Samet, S., Banzhaf, W., Eftekhari, M. (2016). A New Fuzzy-Rough Hybrid Merit to Feature Selection. In: Peters, J., Skowron, A. (eds) Transactions on Rough Sets XX. Lecture Notes in Computer Science, vol 10020. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53611-7_1

DOI: https://doi.org/10.1007/978-3-662-53611-7_1
Published: 21 October 2016
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53610-0
Online ISBN: 978-3-662-53611-7
eBook Packages: Computer Science (R0)