DOI: 10.1145/3453800.3453820

Graph-based Kullback-Leibler Divergence Minimization for Unsupervised Feature Selection

Published: 18 June 2021

Abstract

We live in an era of big data, in which feature selection is receiving more and more attention. Feature selection is one of the most important techniques for reducing the dimensionality of data: it selects a subset of features that is useful for the learning task. Traditional feature selection methods mainly rank features by their scores under a certain criterion and select the top-ranked ones. However, the performance of these methods is often unsatisfactory because they ignore the correlation between features. In this article, we present a new unsupervised method that minimizes the Kullback-Leibler (KL) divergence based on graph matching. First, we extract the manifold structure of the original data space from all features by using non-negative Locally Linear Embedding (NNLLE); then we extract the manifold structure of each individual feature with the same NNLLE procedure. We assess the importance of each feature by minimizing the KL divergence between the graph built on all features and a weighted linear combination of the base graphs built on individual features. We also propose a global optimization algorithm based on the proximal gradient descent framework. Experiments show that the proposed method outperforms many existing unsupervised methods.
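The idea the abstract describes (score features by how well a weighted combination of per-feature graphs reproduces the all-features graph, under a KL objective, optimized with a proximal-gradient-style loop) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' algorithm: the graph construction here is a plain kNN Gaussian affinity (a hypothetical stand-in for the paper's NNLLE graphs), and the proximal step is a simple nonnegative renormalization rather than an exact simplex projection. All function names (`knn_affinity`, `select_features`) are made up for this sketch.

```python
import numpy as np

def knn_affinity(X, k=5):
    """kNN Gaussian affinity graph, normalized to sum to 1 so it can be
    treated as a distribution over edges. A simplified stand-in for the
    paper's NNLLE-based graph construction."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]          # skip self (distance 0)
        W[i, nbrs] = np.exp(-D[i, nbrs] ** 2)
    W = (W + W.T) / 2                             # symmetrize
    return W / W.sum()

def select_features(X, n_iter=200, lr=0.1, eps=1e-12):
    """Weight features by minimizing KL(P || sum_j w_j G_j), where P is the
    graph on all features and G_j the graph on feature j alone."""
    n, d = X.shape
    P = knn_affinity(X)                                        # full-data graph
    G = np.stack([knn_affinity(X[:, [j]]) for j in range(d)])  # base graphs
    w = np.full(d, 1.0 / d)
    for _ in range(n_iter):
        Q = np.tensordot(w, G, axes=1)             # weighted combination
        # gradient of KL(P || Q) with respect to w:  dKL/dw_j = -sum P*G_j/Q
        grad = np.array([-np.sum(P * Gj / (Q + eps)) for Gj in G])
        w = w - lr * grad
        # "proximal" step: clip to nonnegative and renormalize (a crude
        # surrogate for a proper simplex projection)
        w = np.maximum(w, 0.0)
        s = w.sum()
        w = w / s if s > 0 else np.full(d, 1.0 / d)
    return w          # larger weight ~ more important feature
```

Features whose individual-feature graphs best explain the all-features graph receive larger weights, so selecting the top-weighted features approximates the paper's selection criterion under these simplifying assumptions.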



Published In

ICMLSC '21: Proceedings of the 2021 5th International Conference on Machine Learning and Soft Computing
January 2021
178 pages
ISBN:9781450387613
DOI:10.1145/3453800

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Feature Graph
  2. KL-divergence Minimization
  3. Unsupervised Feature Selection

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICMLSC '21
