Neurocomputing

Volume 219, 5 January 2017, Pages 350-363

Unsupervised feature selection via Diversity-induced Self-representation

https://doi.org/10.1016/j.neucom.2016.09.043

Abstract

Feature selection aims to select a subset of relevant features from the original feature set. In practical applications, the unavailability of label information remains a challenging problem. To overcome this problem, unsupervised feature selection algorithms have been developed and have achieved promising performance. However, most existing approaches consider only the representativeness of features and ignore their diversity, which may lead to high redundancy and the loss of valuable features. In this paper, we propose a Diversity-induced Self-representation (DISR) based unsupervised feature selection method to effectively select features that are both representative and diverse. Specifically, based on the inherent self-representation property of features, the most representative features can be selected. Meanwhile, to preserve the diversity of the selected features and reduce the redundancy of the original features as much as possible, we introduce a novel diversity term, which adjusts the weights of selected features by incorporating the similarities between features. We then present an efficient algorithm to solve the optimization problem using the inexact Augmented Lagrange Method (ALM). Finally, both clustering and classification tasks are used to evaluate the proposed method. Empirical results on a synthetic dataset and nine real-world datasets demonstrate the superiority of our method over state-of-the-art algorithms.

Introduction

High-dimensional data are ubiquitous in many areas, such as computer vision, pattern recognition and data mining. Such data not only significantly increase computation and storage costs, but also induce overfitting and incomprehensible models. To overcome these issues, feature selection has been widely adopted as an effective technique for reducing dimensionality by removing irrelevant and redundant features. The aim of feature selection is to obtain a subset of features by removing the noise and redundancy in the original features, so that a more intrinsic representation of the data and better performance are achieved [1].

According to the availability of label information, feature selection approaches can be categorized into supervised methods [2], [3], [4], [5] and unsupervised methods [6], [7], [8], [9], [10], [11]. In contrast to supervised methods, unsupervised feature selection methods aim to select relevant features without label information. Since labeling samples is usually expensive, labels cannot always be obtained beforehand. Thus, unsupervised feature selection holds great potential in real-world applications.

Early unsupervised feature selection algorithms use feature ranking techniques as the principal criterion for feature selection [8], [12], [13], [14], [15]. One main limitation of these methods is that they treat features independently, without considering possible correlations among features. To address this problem, a series of algorithms [10], [16], [17] have been developed. A typical class is spectral clustering based algorithms, which select a feature subset while preserving the underlying cluster structure. Spectral clustering based methods explore the cluster structure of data using matrix factorization for spectral analysis, and then select features via sparsity regularization models. Nevertheless, they rely heavily on the learned graph Laplacian, and noise in the features may make it unreliable. Recently, the self-representation technique has shown significant potential in many tasks, such as subspace clustering [18], [19] and active learning [20], [21]. Motivated by this, some researchers approach feature selection from the perspective of the self-representation property of features [22], i.e., each feature can be well represented by a linear combination of its relevant features. However, they mainly consider selecting representative features while ignoring the diversity among them. Both representativeness and diversity are very important for selecting effective features: (1) the ideal selected features should represent the whole original feature set; that is, highly irrelevant features are discarded while the most relevant features are preserved. (2) The ideal selected features should be diverse enough to capture not only important (representative) but also comprehensive information. By considering the diversity property, we can capture more information about the data, because features usually describe different aspects of it. (3) The diversity property implies that very similar features should not be selected simultaneously, so that redundancy can be greatly reduced. Therefore, there is a great need to integrate both the representativeness and the diversity of features for feature selection.
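To make the self-representation property concrete, the following minimal Python sketch scores each feature by how strongly it participates in reconstructing all features. It uses a simple ridge-regularized least-squares model as a stand-in for the sparse self-representation models in the cited works; the function and variable names are our own illustrations.

    import numpy as np

    def self_representation_scores(X, reg=1e-2):
        """Score feature j by the l2-norm of row j of W, where X ~= X @ W.

        X: (n_samples, n_features) data matrix. A ridge penalty keeps the
        problem well-posed; the sparse penalties used in the cited models
        would additionally zero out entire rows of W.
        """
        d = X.shape[1]
        G = X.T @ X + reg * np.eye(d)      # regularized Gram matrix
        W = np.linalg.solve(G, X.T @ X)    # closed-form ridge solution
        return np.linalg.norm(W, axis=1)   # large row norm = representative

    # Usage: keep the k highest-scoring features.
    X = np.random.randn(100, 20)
    top_k = np.argsort(self_representation_scores(X))[::-1][:5]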

In this paper, considering both the representativeness and the diversity of features, we propose a novel method, called Diversity-induced Self-representation (DISR), for unsupervised feature selection. Specifically, based on the self-representation property, i.e., that each feature can be well approximated by a linear combination of its relevant features, the most representative features can be selected. Meanwhile, by incorporating the similarities between features to adjust their selection weights, we introduce a diversity term to reduce redundancy. An efficient optimization algorithm is then derived using the inexact Augmented Lagrange Method (ALM).
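To preview the shape of such a model, a plausible objective combining the two properties can be sketched as follows; the exact weighting scheme and the definition of the feature-similarity matrix S are illustrative assumptions here, not the formulation derived later in the paper:

    \min_{W} \; \|X - XW\|_{2,1} \;+\; \lambda \|W\|_{2,1} \;+\; \beta \sum_{i \neq j} S_{ij} \, \|w^{i}\|_{2} \, \|w^{j}\|_{2}

where X \in \mathbb{R}^{n \times d} is the data matrix, w^{i} is the i-th row of the representation matrix W (its norm reflects how strongly feature i is selected), and S_{ij} measures the similarity between features i and j. The first two terms encode regularized self-representation; the third discourages assigning large weights to highly similar features simultaneously.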

Finally, we evaluate our method on both clustering and classification tasks. Experimental results on a synthetic dataset and nine real-world datasets show that the proposed DISR outperforms the compared methods.

To summarize, the main contributions of this paper are as follows:

  • A novel Diversity-induced Self-representation (DISR) algorithm for unsupervised feature selection is proposed. The algorithm considers both the representativeness and the diversity of features, and hence can select more valuable features.

  • A diversity term is introduced into the method. This term measures the diversity of all the features, and is thus used to guide feature selection.

  • An iterative algorithm based on the inexact ALM is proposed to efficiently solve the optimization model. Experimental results demonstrate the superiority of our algorithm over state-of-the-art algorithms.

The rest of the paper is organized as follows. A brief review of related work on unsupervised feature selection is given in Section 2. Section 3 introduces the proposed DISR algorithm, and Section 4 describes the optimization of our algorithm in detail. Extensive experiments on synthetic and real-world datasets are presented in Section 5. Finally, Section 6 concludes the paper.


Related work

Recently, many unsupervised feature selection methods have been proposed. These methods can be roughly divided into three categories: filter, wrapper, and embedded methods. Because our work belongs to the embedded category, we first briefly review the filter and wrapper methods, and then review related work on embedded methods in more detail. Filter methods use feature ranking techniques as the principal criterion for feature selection, owing to their simplicity and practical success.
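As a concrete instance of a filter method, the Laplacian score [13] ranks each feature by how well it preserves the local geometry of a neighborhood graph (lower is better). The sketch below follows the standard definition; the kNN size and heat-kernel bandwidth are arbitrary choices for illustration.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    def laplacian_score(X, n_neighbors=5, t=1.0):
        """Return the Laplacian score of each feature (column) of X."""
        # Heat-kernel-weighted kNN affinity graph (construction details assumed)
        A = kneighbors_graph(X, n_neighbors, mode='distance').toarray()
        S = np.where(A > 0, np.exp(-A ** 2 / t), 0.0)
        S = np.maximum(S, S.T)              # symmetrize
        deg = S.sum(axis=1)                 # degree vector
        L = np.diag(deg) - S                # graph Laplacian
        scores = np.empty(X.shape[1])
        for r in range(X.shape[1]):
            f = X[:, r]
            f = f - (f @ deg) / deg.sum()   # degree-weighted centering
            scores[r] = (f @ (L @ f)) / max(f @ (deg * f), 1e-12)
        return scores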

Diversity-induced Self-representation for unsupervised feature selection

In this section, we first introduce unsupervised feature selection based on regularized self-representation, and then present our novel unsupervised feature selection method via Diversity-induced Self-representation.
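As background, the regularized self-representation (RSR) model referred to here is commonly written as

    \min_{W} \; \|X - XW\|_{2,1} \;+\; \lambda \|W\|_{2,1}

where each feature (column of X) is reconstructed as a linear combination of all features, and feature i is then ranked by the row norm \|w^{i}\|_{2}. This notation sketches the standard formulation rather than quoting the paper's own equations.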

Optimization

We have now presented the method based on Diversity-induced Self-representation. Finding the global optimum directly is hard, since two non-smooth terms are involved. Thus, we employ the inexact Augmented Lagrange Method (ALM) [40] to optimize each variable iteratively.
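For intuition about how an inexact ALM handles two non-smooth terms, the sketch below solves a simplified member of this model family, min_W lam*||W||_{2,1} + ||E||_{2,1} subject to X - XW = E, omitting the diversity term. It is a generic linearized-ALM template written under our own assumptions, not the authors' exact solver.

    import numpy as np

    def row_shrink(A, tau):
        """Proximal operator of tau * ||A||_{2,1}: shrink each row's l2-norm."""
        norms = np.linalg.norm(A, axis=1, keepdims=True)
        return np.maximum(1 - tau / np.maximum(norms, 1e-12), 0) * A

    def inexact_alm_sketch(X, lam=1.0, mu=1.0, rho=1.1, iters=200):
        n, d = X.shape
        W, E, Y = np.zeros((d, d)), np.zeros((n, d)), np.zeros((n, d))
        step = np.linalg.norm(X, 2) ** 2    # Lipschitz bound for the W-step
        for _ in range(iters):
            # E-step: closed-form l2,1 proximal update
            E = row_shrink(X - X @ W + Y / mu, 1.0 / mu)
            # W-step: one linearized proximal-gradient step (the inexact part)
            B = X - E + Y / mu
            W = row_shrink(W - X.T @ (X @ W - B) / step, lam / (mu * step))
            # Dual ascent and penalty growth
            Y = Y + mu * (X - X @ W - E)
            mu = rho * mu
        return W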

Experiments

In this section, we first demonstrate the effectiveness of the proposed DISR on a synthetic dataset, and then evaluate the performance of DISR in both clustering and classification tasks on nine real-world datasets.
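A common protocol for such an evaluation, sketched below under the assumption that a per-feature score vector (e.g., the row norms of the learned W) is already available, is to cluster the data restricted to the top-ranked features and measure agreement with ground-truth labels by NMI:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import normalized_mutual_info_score

    def evaluate_selection(X, y, scores, k=50):
        """Run k-means on the top-k scored features; report NMI against y."""
        idx = np.argsort(scores)[::-1][:k]          # indices of top-k features
        km = KMeans(n_clusters=len(np.unique(y)), n_init=10, random_state=0)
        return normalized_mutual_info_score(y, km.fit_predict(X[:, idx]))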

Conclusion

In this paper, we propose a novel unsupervised feature selection method via Diversity-induced Self-representation, called DISR, which selects features that are both representative and diverse. By incorporating the similarities between features to adjust their selection weights, a novel diversity term is designed to eliminate redundancy among the selected features. To solve the resulting optimization problem, an efficient algorithm based on the inexact ALM is presented.


References (51)

  • P. Mitra et al., Unsupervised feature selection using feature similarity, IEEE Trans. Pattern Anal. Mach. Intell. (2002)
  • D. Liang, Z. Shen, L. Xuan, Z. Peng, Y.D. Shen, Local and global discriminative learning for unsupervised feature...
  • X. He, D. Cai, P. Niyogi, Laplacian score for feature selection, in: Advances in Neural Information Processing Systems,...
  • D. Cai, C. Zhang, X. He, Unsupervised feature selection for multi-cluster data, in: Proceedings of the International...
  • Z. Zhao, H. Liu, Spectral feature selection for supervised and unsupervised learning, in: Proceedings of the...
  • F. Nie, S. Xiang, Y. Jia, C. Zhang, S. Yan, Trace ratio criterion for feature selection, in: Association for the...
  • P. Jing, Y. Su, C. Xu, L. Zhang, Hyperssr: a hypergraph based semi-supervised ranking method for visual search...
  • Z. Li, Y. Yang, J. Liu, X. Zhou, H. Lu, Unsupervised feature selection using nonnegative spectral analysis, in:...
  • Y. Yang, H.T. Shen, Z. Ma, Z. Huang, X. Zhou, ℓ2,1-norm regularized discriminative feature selection for unsupervised...
  • E. Elhamifar et al., Sparse subspace clustering: algorithm, theory, and applications, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
  • C. Lu, J. Feng, Z. Lin, S. Yan, Correlation adaptive subspace segmentation by trace lasso, in: Proceedings of the...
  • F. Nie, H. Wang, H. Huang, C.H. Ding, Early active learning via robust representation and structured sparsity, in:...
  • Y. Hu, D. Zhang, Z. Jin, D. Cai, X. He, Active learning via neighborhood reconstruction, in: Proceedings of the...
  • W. Krzanowski, Selection of variables to preserve multivariate data structure, using principal components, Appl. Stat. (1987)
  • C. Constantinopoulos et al., Bayesian feature and model selection for Gaussian mixture models, IEEE Trans. Pattern Anal. Mach. Intell. (2006)

Yanbei Liu received the B.E. degree from Zhengzhou University of Light Industry, Zhengzhou, China, in 2009 and the M.E. degree from Tianjin Polytechnic University, Tianjin, China, in 2012. He is currently pursuing the Ph.D. degree at the School of Electronic Information Engineering, Tianjin University, Tianjin, China. His current research interests include machine learning, subspace learning, and pattern recognition.

Kaihua Liu received the B.E. degree in 1981, the M.E. degree in 1991, and the Ph.D. degree in 1999, all from Tianjin University, Tianjin, China. He is currently a Professor at the School of Electronic Information Engineering, Tianjin University, Tianjin, China. His current research interests include radio frequency identification theory and application, digital signal processing theory and application, and pattern recognition.

Changqing Zhang received the B.S. and M.E. degrees in computer science from Sichuan University in 2005 and 2008, respectively, and the Ph.D. degree from Tianjin University in 2016. He is currently an Assistant Professor with Tianjin University. His current research interests include machine learning, data mining, and computer vision.

Jing Wang is currently pursuing the Ph.D. degree at the Faculty of Science and Technology, Bournemouth University, UK. Before that, she received the M.E. degree from City University of Hong Kong, China. Her current research interests include machine learning and data mining, such as nonnegative matrix factorization, subspace clustering and semi-supervised learning.

Xiao Wang received the M.E. degree from Henan University, Kaifeng, China, in 2012 and the Ph.D. degree from the School of Computer Science and Technology, Tianjin University, Tianjin, China, in 2016. He is currently a postdoctoral researcher in the Department of Computer Science and Technology, Tsinghua University, Beijing, China. He received the China Scholarship Council Fellowship in 2014 and visited Washington University in St. Louis, USA, as a joint-training student from Nov. 2014 to Nov. 2015. His current research interests include complex network analysis, machine learning, and data mining.
