A Fuzzy System for Combining Filter Features Selection Methods

International Journal of Fuzzy Systems

Abstract

Feature selection is considered one of the most important data pre-processing steps in many modelling fields, especially for prediction and classification. It belongs to the wider class of data mining procedures: by analysing the available data, it identifies the variables that most affect a given phenomenon, thus increasing knowledge of the process under study. There are three main categories of feature selection approaches, namely filter, wrapper and embedded methods. This work focuses on the first category and, in particular, on a fuzzy logic-based procedure that combines several traditional filter methods. Filter methods exploit intrinsic properties of the data to select features before the learning task; compared with the other kinds of approaches, they require less computational time and are well suited to datasets with a large number of instances and features. To demonstrate the effectiveness of the proposed approach, several tests have been performed. Different classifiers have been designed and applied to binary classification on different datasets: some widely used public datasets with many instances and features, and two datasets from the metal industry. The results are presented and discussed in the paper.
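
As a rough illustration of combining filter scores, the sketch below computes two common filter criteria for a binary label (a two-class Fisher-style score and the absolute Pearson correlation), rescales each to [0, 1], and merges them. The fuzzy inference system proposed in the paper is not reproduced in this preview, so a plain average stands in for it here. The function names, the choice of criteria, and the toy data are assumptions made only for this example.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher-style criterion: squared mean gap over summed variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    return gap / (spread + 1e-12)  # small constant avoids division by zero


def abs_correlation(values, labels):
    """Absolute Pearson correlation between one feature and the 0/1 label."""
    mv, ml = mean(values), mean(labels)
    cov = sum((v - mv) * (y - ml) for v, y in zip(values, labels))
    norm = (sum((v - mv) ** 2 for v in values)
            * sum((y - ml) ** 2 for y in labels)) ** 0.5
    return abs(cov / norm) if norm > 0 else 0.0


def min_max(scores):
    """Rescale scores to [0, 1] so different criteria become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combined_ranking(columns, labels):
    """Rank features by the average of two normalised filter scores.

    The average is an assumed placeholder, not the paper's fuzzy system.
    """
    fisher = min_max([fisher_score(c, labels) for c in columns])
    corr = min_max([abs_correlation(c, labels) for c in columns])
    combined = [(f + r) / 2 for f, r in zip(fisher, corr)]
    order = sorted(range(len(columns)), key=lambda i: combined[i], reverse=True)
    return order, combined


# Toy data (hypothetical): one informative, one weak and one uninformative feature.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
weak = [1.0, 2.0, 1.5, 2.5, 1.8, 2.6, 2.2, 3.0]
noise = [5.0, 1.0, 4.0, 2.0, 4.5, 1.5, 2.5, 3.5]

order, combined = combined_ranking([informative, weak, noise], labels)
print(order)  # feature indices from most to least relevant
```

Because both criteria are computed from the data alone, no classifier is trained during selection, which is what keeps filter methods cheap compared with wrapper and embedded approaches.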

Author information

Corresponding author

Correspondence to Silvia Cateni.

About this article

Cite this article

Cateni, S., Colla, V. & Vannucci, M. A Fuzzy System for Combining Filter Features Selection Methods. Int. J. Fuzzy Syst. 19, 1168–1180 (2017). https://doi.org/10.1007/s40815-016-0208-7

  • DOI: https://doi.org/10.1007/s40815-016-0208-7
