Improving Classification Accuracy by Means of the Sliding Window Method in Consistency-Based Feature Selection

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10558)

Abstract

In the digital era, collecting relevant information about a technological process has become increasingly cheap and easy. However, owing to the huge amount of data available, supervised classification is one of the most challenging tasks in artificial intelligence. Feature selection addresses this problem by removing irrelevant and redundant features from the data. In this paper we propose a new feature selection algorithm, called Swcfs, which performs well on high-dimensional and noisy data. Swcfs detects noisy features by sliding a window over consecutive features ranked according to their non-linear correlation with the class feature. The metric Swcfs uses to evaluate feature sets with respect to their relevance to the class label is the Bayesian risk, which represents the theoretical upper error bound of deterministic classification. Experiments show that Swcfs is more accurate than most state-of-the-art feature selection algorithms.
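For reference, the Bayesian risk of a feature subset F can be written as follows. This is the standard definition from the consistency-based feature selection literature; the paper's exact formulation may differ in detail:

    \mathrm{Br}(F) = \sum_{x} \Pr(X_F = x)\,\bigl(1 - \max_{y} \Pr(Y = y \mid X_F = x)\bigr) = 1 - \sum_{x} \max_{y} \Pr(X_F = x,\, Y = y)

where x ranges over the value vectors of the features in F. A subset is fully consistent with the class exactly when Br(F) = 0.

To make the procedure concrete, below is a minimal Python sketch of sliding-window, consistency-based feature selection. It assumes discrete features, uses mutual information as the non-linear correlation measure for ranking, and applies an illustrative window heuristic; the function names (mutual_information, bayesian_risk, sliding_window_select) are hypothetical, and this is not the authors' exact Swcfs algorithm.

    import math
    from collections import Counter

    def mutual_information(xs, ys):
        # I(X; Y) for two discrete sequences: a common choice for the
        # "non-linear correlation" used to rank features against the class.
        n = len(xs)
        px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
        return sum((c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
                   for (a, b), c in pxy.items())

    def bayesian_risk(X, y, features):
        # Empirical Br(F) = 1 - sum_x max_y P(x, y): the error of the best
        # deterministic classifier that sees only the features in F.
        n = len(y)
        joint = Counter((tuple(row[f] for f in features), label)
                        for row, label in zip(X, y))
        best = {}
        for (xv, label), count in joint.items():
            best[xv] = max(best.get(xv, 0), count)
        return 1.0 - sum(best.values()) / n

    def sliding_window_select(X, y, window=3, tol=1e-6):
        # Rank features by mutual information with the class, then repeatedly
        # examine a window of consecutive candidates: keep the candidate whose
        # addition lowers the Bayesian risk the most, and slide the window
        # past candidates that never help (these are treated as noise).
        ranked = sorted(range(len(X[0])),
                        key=lambda f: mutual_information(
                            [row[f] for row in X], y),
                        reverse=True)
        selected, risk, i = [], bayesian_risk(X, y, []), 0
        while i < len(ranked) and risk > tol:
            candidates = ranked[i:i + window]
            new_risk, best_f = min(
                (bayesian_risk(X, y, selected + [f]), f) for f in candidates)
            if new_risk < risk - tol:
                selected.append(best_f)
                risk = new_risk
                ranked.remove(best_f)  # window slides over remaining features
            else:
                i += window            # no candidate helped: skip as noise
        return selected, risk

On a toy data set where y = f0 OR f1 and feature 2 is noise, the sketch keeps the two informative features and drops the noisy one:

    X = [[0, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0], [0, 0, 0], [0, 1, 1]]
    y = [0, 1, 1, 1, 0, 1]
    print(sliding_window_select(X, y, window=2))   # -> ([1, 0], 0.0)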



Author information

Correspondence to Adrian Pino Angulo.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Pino Angulo, A., Shin, K. (2017). Improving Classification Accuracy by Means of the Sliding Window Method in Consistency-Based Feature Selection. In: Yamamoto, A., Kida, T., Uno, T., Kuboyama, T. (eds) Discovery Science. DS 2017. Lecture Notes in Computer Science, vol. 10558. Springer, Cham. https://doi.org/10.1007/978-3-319-67786-6_12


  • DOI: https://doi.org/10.1007/978-3-319-67786-6_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67785-9

  • Online ISBN: 978-3-319-67786-6

  • eBook Packages: Computer Science, Computer Science (R0)
