Semi-supervised local feature selection for data classification

  • Research Paper
  • Science China Information Sciences

Abstract

Conventional feature selection methods select the same feature subset for all classes, so the selected features may work better for some classes than for others. To address this limitation, this paper proposes a new semi-supervised local feature selection method (S2LFS) that selects different feature subsets for different classes. In this method, class-specific feature subsets are selected by learning the importance of features for each class separately. In particular, the class labels of all available data are jointly learned under a consistency constraint on the labeled data, which enables the proposed method to select the most discriminative features. Experiments on six data sets demonstrate the effectiveness of the proposed method compared with several popular feature selection methods.
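
To make the idea of class-specific feature subsets concrete, the following is a minimal illustrative sketch. It is not the S2LFS algorithm: it is fully supervised, uses no unlabeled data, and ranks features with a simple one-vs-rest Fisher-style score chosen here only for illustration. The function name `class_specific_features` and the toy data are assumptions introduced for this example.

```python
from statistics import mean, pvariance


def class_specific_features(X, y, k):
    """Return {class_label: indices of the k top-scoring features}.

    Illustrative only (not S2LFS). Each feature is scored per class with a
    one-vs-rest Fisher-style ratio:
        (mean_in - mean_out)^2 / (var_in + var_out + eps)
    """
    eps = 1e-12  # avoids division by zero for constant features
    n_features = len(X[0])
    selected = {}
    for c in sorted(set(y)):
        inside = [row for row, label in zip(X, y) if label == c]
        outside = [row for row, label in zip(X, y) if label != c]
        scores = []
        for j in range(n_features):
            a = [row[j] for row in inside]
            b = [row[j] for row in outside]
            gap = (mean(a) - mean(b)) ** 2
            spread = pvariance(a) + pvariance(b) + eps
            scores.append(gap / spread)
        # Rank features for this class only, highest score first.
        ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        selected[c] = ranked[:k]
    return selected


# Toy data (hypothetical): feature 0 separates class "a",
# feature 1 separates class "b", feature 2 separates class "c".
X = [
    [5.0, 0.1, 0.0], [5.1, 0.0, 0.1],  # class a
    [0.0, 5.0, 0.1], [0.1, 5.1, 0.0],  # class b
    [0.1, 0.0, 5.0], [0.0, 0.1, 5.1],  # class c
]
y = ["a", "a", "b", "b", "c", "c"]

subsets = class_specific_features(X, y, k=1)
print(subsets)  # {'a': [0], 'b': [1], 'c': [2]}
```

On this toy data each class selects a different feature, which is the behavior a single global feature subset cannot express. The method described in the abstract additionally learns labels for the unlabeled data jointly with the feature importances, which this sketch does not attempt.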

Acknowledgements

This work was supported by the National Key Research and Development Program of China (Grant No. 2017YFC0820601), the National Natural Science Foundation of China (Grant Nos. 61720106004, 61732007), and the Natural Science Foundation of Jiangsu Province (Grant No. BK20170033).

Author information

Corresponding author

Correspondence to Jinhui Tang.

About this article

Cite this article

Li, Z., Tang, J. Semi-supervised local feature selection for data classification. Sci. China Inf. Sci. 64, 192108 (2021). https://doi.org/10.1007/s11432-020-3063-0

  • DOI: https://doi.org/10.1007/s11432-020-3063-0
