
An exponent weighted algorithm for minimal cost feature selection

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Minimal cost feature selection plays a crucial role in cost-sensitive learning. It aims to determine a feature subset that minimizes the average total cost by trading off test costs against misclassification costs. Recently, a backtracking algorithm was developed to tackle this problem. Unfortunately, its efficiency on large datasets is often unacceptable, and its run time grows significantly as misclassification costs rise. In this paper, we develop an exponent weighted algorithm for minimal cost feature selection, in which an exponent weighted function of feature significance is constructed to improve efficiency. The function is built from information entropy, the test cost, and a user-specified non-positive exponent. The effectiveness of our algorithm is demonstrated on six UCI datasets under two representative test cost distributions. Compared with the existing backtracking algorithm, the proposed algorithm is significantly more efficient and is not influenced by the misclassification cost setting.
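To make the construction concrete, here is a minimal sketch under stated assumptions: the average total cost is taken to be the sum of the selected features' test costs plus the expected misclassification cost, and the weighted significance is taken to have the multiplicative form gain × test_cost^λ with λ ≤ 0. The function names and the exact weighting form are illustrative assumptions, not the authors' formulation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(column, labels):
    """Information gain of one discrete feature column w.r.t. the labels."""
    n = len(labels)
    cond = 0.0
    for value in set(column):
        sub = [y for x, y in zip(column, labels) if x == value]
        cond += (len(sub) / n) * entropy(sub)
    return entropy(labels) - cond


def weighted_significance(gain, test_cost, lam=-0.5):
    """Exponent weighted significance: gain * test_cost ** lam, lam <= 0.

    lam = 0 ignores costs entirely; a more negative lam penalizes
    costly features more heavily."""
    assert lam <= 0 and test_cost > 0
    return gain * test_cost ** lam


def average_total_cost(selected_test_costs, error_rate, mis_cost):
    """Simplified objective: test costs of the chosen subset plus
    expected misclassification cost (error rate times unit cost)."""
    return sum(selected_test_costs) + error_rate * mis_cost


# Toy example: two equally informative features with different test costs.
labels = [0, 0, 1, 1]
cheap = [0, 0, 1, 1]   # test cost 1
costly = [1, 1, 0, 0]  # test cost 9, same information gain
g = info_gain(cheap, labels)                     # 1.0 bit
print(weighted_significance(g, 1.0))             # 1.0
print(weighted_significance(g, 9.0))             # ~0.33, demoted by its cost
print(average_total_cost([1.0], error_rate=0.0, mis_cost=200))  # 1.0
```

In a greedy selection, features would be evaluated in decreasing order of this weighted significance and the subset with the lowest average total cost retained; how far λ sits below zero controls how aggressively expensive features are demoted.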



Acknowledgments

This work was supported in part by the Key Project of the Education Department of Fujian Province under Grant No. JA13192, the Zhangzhou Municipal Natural Science Foundation under Grant No. ZZ2013J03, the National Natural Science Foundation of China under Grant Nos. 61379049, 61379089, and 61170128, and the Postgraduate Research Innovation Project of Minnan Normal University under Grant No. YJS201438.

Author information


Corresponding author

Correspondence to Hong Zhao.


About this article


Cite this article

Li, X., Zhao, H. & Zhu, W. An exponent weighted algorithm for minimal cost feature selection. Int. J. Mach. Learn. & Cyber. 7, 689–698 (2016). https://doi.org/10.1007/s13042-014-0279-4
