Constrained class-wise feature selection (CCFS)

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Feature selection plays a vital role as a preprocessing step for high-dimensional data in machine learning. Its basic purpose is to avoid the "curse of dimensionality" and to reduce the time and space complexity of training. Several techniques, including information-theoretic ones, have been proposed in the literature to measure the information content of a feature. Most of them incrementally select features that have maximum dependency on the target category but minimum redundancy with the already selected features. A key idea missing from these techniques is fair representation of the different categories among the selected features: features with high mutual information (MI) with one particular class tend to be selected disproportionately. This can bias classification in favor of that class, while the other classes obtain low matching scores during classification. We propose a novel information-theoretic approach that selects features in a class-wise fashion rather than by their global maximum dependency. In addition, a constrained search is used instead of a global sequential forward search. We prove that our proposed approach enhances Maximum Relevance while preserving Minimum Redundancy under a constrained search. Results on multiple benchmark datasets show that the proposed method improves accuracy compared to other state-of-the-art feature selection algorithms while having lower time complexity.
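
To make the class-wise selection idea concrete, below is a minimal Python sketch, not the paper's exact CCFS procedure: it ranks features by one-vs-rest mutual information for each class in turn and keeps a small per-class budget, approximating the constrained search by simply skipping features that are already selected. The helper name classwise_mi_selection and the budget k_per_class are illustrative assumptions; mutual_info_classif and load_digits are standard scikit-learn utilities.

```python
# Minimal sketch of class-wise MI feature selection (illustrative, not the
# authors' exact CCFS algorithm). Each class gets its own top-MI features,
# so no single class can dominate the selected subset.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import mutual_info_classif

def classwise_mi_selection(X, y, k_per_class=3):
    """Pick the top-k_per_class features for each class by one-vs-rest MI."""
    selected = []
    for c in np.unique(y):
        y_c = (y == c).astype(int)                   # binary labels: class c vs rest
        mi = mutual_info_classif(X, y_c, random_state=0)
        picked = 0
        for f in np.argsort(mi)[::-1]:               # highest-MI features first
            if f not in selected:                    # crude non-duplication constraint
                selected.append(f)
                picked += 1
            if picked == k_per_class:
                break
    return np.array(selected)

X, y = load_digits(return_X_y=True)
feats = classwise_mi_selection(X, y)
print(len(feats), "features selected class-wise:", feats)
```

In this toy version every class is guaranteed up to k_per_class of its own high-MI features, which is the fairness property the abstract argues for; the paper's constrained search additionally controls redundancy among the selected features, which the duplicate-skipping here only approximates.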

Data availability

The datasets used and analyzed during the current study are publicly available in the UCI repository and other sources, as listed below; a short loading sketch follows the list.

  1. 20 Newsgroups: https://archive.ics.uci.edu/ml/datasets/Twenty+Newsgroups
  2. 4 Universities dataset: http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/
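
As a convenience sketch (our addition, not the authors' preprocessing pipeline), the 20 Newsgroups corpus listed above can also be fetched programmatically with scikit-learn's built-in loader rather than downloaded by hand; the max_features cap below is an arbitrary illustrative choice.

```python
# Fetch 20 Newsgroups and build a bag-of-words matrix with scikit-learn.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer

news = fetch_20newsgroups(subset="train",
                          remove=("headers", "footers", "quotes"))
# Term-count matrix; max_features=20000 is an illustrative vocabulary cap.
X = CountVectorizer(max_features=20000, stop_words="english").fit_transform(news.data)
print(X.shape)                 # (n_documents, n_terms)
print(len(news.target_names))  # 20 classes
```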

Funding

This work was done as part of an MS thesis by Fatima Shahzadi, supported by the Ghulam Ishaq Khan Institute of Engineering Sciences and Technology under the GA-scheme.

Author information

Corresponding author

Correspondence to Syed Fawad Hussain.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hussain, S.F., Shahzadi, F. & Munir, B. Constrained class-wise feature selection (CCFS). Int. J. Mach. Learn. & Cyber. 13, 3211–3224 (2022). https://doi.org/10.1007/s13042-022-01589-5
