Abstract
Feature selection is important in many machine learning and data mining applications. This paper proposes a simple and effective feature selection method based on correlation deflation. The objective of residual minimization under an \(L_{2,0}\)-norm constraint is proved to be equivalent to maximizing the sum of squared correlations between class labels and features. The whole procedure then reduces to selecting features one by one while deflating correlations. Experiments on several public benchmark data sets show that the proposed method achieves better residual reduction and classification performance than many state-of-the-art feature selection methods.
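To make the idea of selecting features one by one while deflating correlations more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a greedy scheme in which each step picks the feature with the largest sum of squared normalized inner products with the current label residual, then projects that feature out of both the labels and the remaining features. The function name `select_by_correlation_deflation`, the centering step, and the tolerance `eps` are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def select_by_correlation_deflation(X, Y, k, eps=1e-10):
    """Greedy sketch (assumed scheme, not the paper's exact algorithm).

    X: (n_samples, n_features) data matrix.
    Y: (n_samples,) target vector or (n_samples, n_classes) indicator matrix.
    k: number of features to select.
    Returns the list of selected column indices in selection order.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]

    # Center columns so inner products behave like (unnormalized) covariances.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    n_features = X.shape[1]
    chosen = np.zeros(n_features, dtype=bool)
    selected = []

    for _ in range(min(k, n_features)):
        norms = np.linalg.norm(X, axis=0)
        # Skip already chosen columns and columns deflated to (near) zero.
        valid = (norms > eps) & ~chosen
        if not valid.any():
            break

        # score[j] = sum_c (x_j^T y_c)^2 / ||x_j||^2, i.e. the drop in the
        # squared residual norm of Y if feature j is projected out next.
        scores = np.full(n_features, -np.inf)
        scores[valid] = ((X[:, valid].T @ Y) ** 2).sum(axis=1) / norms[valid] ** 2

        j = int(np.argmax(scores))
        selected.append(j)
        chosen[j] = True

        # Deflation: remove the direction of the chosen feature from the
        # label residual and from every remaining feature.
        q = X[:, j] / norms[j]
        Y = Y - np.outer(q, q @ Y)
        X = X - np.outer(q, q @ X)

    return selected
```

Under these assumptions, each step lowers the squared Frobenius norm of the label residual by exactly the winning score, so the greedy choice is the one-step residual-minimizing choice. Features that are nearly collinear with an already selected feature shrink toward zero after deflation and are skipped by the `eps` check.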
Acknowledgements
The authors would like to thank the Editor and the anonymous reviewers for their valuable comments and suggestions, which helped improve the paper. This work was supported in part by the Key Project of Chinese National Programs for Fundamental Research and Development (973 Program) under Grant 2015CB351705, in part by the National Natural Science Foundation of China under Grants 61202228, 61472002, 61572030 and 61671018, and in part by the Collegiate Natural Science Fund of Anhui Province under Grant KJ2017A014.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Chen, SB., Ding, C.H.Q., Zhou, ZL. et al. Feature selection based on correlation deflation. Neural Comput & Applic 31, 6383–6392 (2019). https://doi.org/10.1007/s00521-018-3467-4
DOI: https://doi.org/10.1007/s00521-018-3467-4