
Filter Feature Selection for One-Class Classification

Journal of Intelligent & Robotic Systems

Abstract

In one-class classification problems, all training examples belong to a single class. The absence of counter-examples poses a challenge to traditional machine learning and pre-processing techniques, including many feature selection techniques designed for labeled data. Selecting the most relevant features from a dataset usually improves the performance of classification algorithms. Despite the relevance of this issue, few techniques have been proposed for feature selection in one-class classification problems. Moreover, most of the existing techniques are wrapper approaches, which must rely on a specific classification algorithm for feature selection, or aggregation techniques. This paper proposes a new filter feature selection approach for one-class classification. First, five feature selection measures from different paradigms are employed or adapted to the one-class scenario. Next, the feature rankings produced by these measures are combined using different aggregation strategies. The proposed approach was able to reduce the size of the feature sets while maintaining or even improving the predictive performance obtained by the one-class classifier.
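The two-step idea in the abstract (score features without labels, then combine the resulting rankings) can be illustrated with a minimal sketch. The abstract does not name the five measures or the aggregation strategies, so everything here is an assumption: `variance_scores` and `range_scores` are placeholder measures chosen only to have something to rank, and a Borda count stands in as one well-known aggregation strategy. All function names are hypothetical.

```python
from statistics import pvariance


def variance_scores(X):
    """Placeholder measure (assumption): population variance of each feature."""
    return [pvariance(col) for col in zip(*X)]


def range_scores(X):
    """Placeholder measure (assumption): max minus min of each feature."""
    return [max(col) - min(col) for col in zip(*X)]


def rank_from_scores(scores, higher_is_better=True):
    """Turn scores into rank positions per feature (1 = best)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    ranks = [0] * len(scores)
    for position, feature in enumerate(order, start=1):
        ranks[feature] = position
    return ranks


def borda_aggregate(rankings):
    """Combine rankings with a Borda count.

    Each ranking gives a feature (n - rank) points; features are returned
    in order of descending total points, ties broken by feature index.
    """
    n = len(rankings[0])
    points = [0] * n
    for ranks in rankings:
        for feature, rank in enumerate(ranks):
            points[feature] += n - rank
    return sorted(range(n), key=lambda i: -points[i])


def select_features(X, k):
    """Rank features with each placeholder measure, aggregate, keep the top k."""
    rankings = [
        rank_from_scores(variance_scores(X)),
        rank_from_scores(range_scores(X)),
    ]
    return borda_aggregate(rankings)[:k]


# Toy single-class data: rows are examples, columns are features.
X = [
    [1.0, 10.0, 5.0],
    [2.0, 10.5, 5.0],
    [3.0, 9.5, 5.0],
    [4.0, 10.0, 5.0],
]
selected = select_features(X, k=2)
```

Because the scoring step never touches a classifier, this is a filter in the sense used above; swapping in other unsupervised measures only changes the list built inside `select_features`.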



Author information

Corresponding author

Correspondence to Luiz H N Lorena.


About this article


Cite this article

Lorena, L.H.N., Carvalho, A.C.P.L.F. & Lorena, A.C. Filter Feature Selection for One-Class Classification. J Intell Robot Syst 80 (Suppl 1), 227–243 (2015). https://doi.org/10.1007/s10846-014-0101-2


  • DOI: https://doi.org/10.1007/s10846-014-0101-2
