
Neurocomputing

Volume 149, Part B, 3 February 2015, Pages 483-489

Letters
Blind identification of the underdetermined mixing matrix based on K-weighted hyperline clustering

https://doi.org/10.1016/j.neucom.2014.08.026

Highlights

  • We propose a discriminatory clustering algorithm for hyperline identification.

  • A weighting scheme is employed for the discriminatory clustering.

  • The proposed method is suitable for the multiple dominant SCA problem.

Abstract

Blind identification of the underdetermined mixing matrix is an emerging problem in the area of sparse component analysis (SCA). Traditionally, the K-hyperline clustering (K-HLC) learning algorithm is employed to solve it, but this method is designed under a strict sparsity assumption on the source signals. To handle the multiple dominant case of the blind identification problem, a discriminatory clustering algorithm, K-weighted hyperline clustering (K-WHLC), is developed via a weighting scheme. The Gaussian membership function serves as the weight factor for the hyperline clustering, together with an optimal selection rule for the involved tolerance threshold. As shown in the paper, this discriminatory clustering scheme is efficient and especially suitable for hyperline identification in the multiple dominant SCA problem. Moreover, the developed algorithm achieves higher accuracy than the traditional K-HLC, with lower computational cost on medium- and large-scale problems. Numerical simulations are finally provided to verify the advantages of our clustering scheme.

Introduction

In recent years, sparse component analysis (SCA) [1], [2], [3] has been applied in many fields, such as image processing, electromagnetic and biomagnetic imaging, speech separation, compressed sensing, and so on [4], [5], [6], [7], [8]. The typical linear SCA model is

x(t) = A s(t) + e(t),  or  X = A S + E,  (1)

where X = [x(1), x(2), …, x(T)] ∈ R^{m×T} denotes the observation matrix, A = [α_1, …, α_n] ∈ R^{m×n} denotes the unknown mixing matrix, S = [s(1), …, s(T)] ∈ R^{n×T} denotes the source matrix, and E = [e(1), …, e(T)] ∈ R^{m×T} denotes the noise matrix. In general, any m columns of A are assumed to be linearly independent, and each column is normalized to unit length, i.e., ‖α_i‖₂² = 1, i = 1, 2, …, n. The task of SCA is to identify the mixing matrix and recover the sources using only the observed samples. If the number of observations is less than the number of sources (m < n), then system (1) is called an underdetermined mixing system. In this case, a sparsity assumption is usually treated as a necessary additional condition for SCA. A two-stage "clustering-then-l1-optimization" approach is often used in SCA, i.e., clustering to identify the mixing matrix first and then using a sparsity-based optimization method to recover the sources [9]. In this paper, we focus on the blind identification problem of mixing matrix estimation in SCA.
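As a rough illustration (not the paper's code), the linear model (1) with q-sparse sources, unit-norm mixing columns, and additive Gaussian noise can be simulated as follows; all names and parameter defaults here are our own assumptions:

```python
import numpy as np

def generate_sca_data(m=2, n=4, T=1000, q=1, sigma_s=1.0, sigma_e=0.01, seed=0):
    """Simulate X = A S + E for an underdetermined (m < n) SCA model.

    Each column of A has unit 2-norm; each column of S has at most q
    active (nonzero) entries, i.e., the sources are q-sparse.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))
    A /= np.linalg.norm(A, axis=0)                 # normalize columns: ||a_i||_2 = 1
    S = np.zeros((n, T))
    for t in range(T):
        active = rng.choice(n, size=q, replace=False)   # q active components
        S[active, t] = sigma_s * rng.standard_normal(q)
    E = sigma_e * rng.standard_normal((m, T))      # noise matrix
    return A, S, A @ S + E
```

With q = 1 this produces the single dominant (disjoint-orthogonality) case; q > 1 gives the multiple dominant data the paper targets.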

Most existing linear orientation-based algorithms [10], [11], such as the K-SVD algorithm, mainly address the single dominant SCA problem, where the sources are assumed to be sparse enough to satisfy the disjoint orthogonality condition [12], i.e., at each time instant only one entry of s(t) is nonzero while the other entries are zero. Moreover, He et al. [13] designed the so-called K-hyperline clustering (K-HLC) learning algorithm to improve on the K-SVD algorithm [14], [15], in which the mixing matrix identification procedure consists of two stages: hyperline identification and hyperline number detection. In the first stage, the K-means clustering method [16], combined with eigenvalue decomposition (EVD), is used to cluster the samples and extract each hyperline from the corresponding cluster set; in the second stage, an eigenvalue gap-based detection method [17], [18], [19] is employed to determine the true number of sources. The K-HLC algorithm performs excellently in the sufficiently sparse SCA case, but it can hardly be applied to multiple dominant SCA directly. It is also sensitive to the selection of the parameter K, i.e., the overestimated number of hyperlines, which is difficult to choose a priori. If K is selected too small, the algorithm may fail to detect the correct number of hyperlines. Thus, K-HLC uses a multi-layer clustering scheme with a large K to enhance robustness. Unfortunately, this not only increases the computational cost but also requires large storage space, especially for large-scale problems.
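The two alternating steps of the first stage (K-means-style assignment to the nearest hyperline, then an EVD update of each line) can be sketched as below. This is a minimal illustration of the idea, not He et al.'s implementation; the function and variable names are our own:

```python
import numpy as np

def k_hyperline_clustering(X, K, n_iter=50, seed=0):
    """Sketch of K-hyperline clustering.

    Alternates between (i) assigning each sample to the hyperline (1-D
    subspace) it is closest to, and (ii) re-estimating each line as the
    principal eigenvector of its cluster's scatter matrix (the EVD step).
    """
    m, T = X.shape
    rng = np.random.default_rng(seed)
    L = rng.standard_normal((m, K))
    L /= np.linalg.norm(L, axis=0)                 # unit-norm line directions
    labels = np.zeros(T, dtype=int)
    for _ in range(n_iter):
        # d^2(x, l) = ||x||^2 - (l^T x)^2 for unit l, so the nearest line
        # is the one maximizing |l^T x|.
        labels = np.argmax(np.abs(L.T @ X), axis=0)
        for k in range(K):
            Xk = X[:, labels == k]
            if Xk.shape[1] == 0:                   # keep old direction if empty
                continue
            _, V = np.linalg.eigh(Xk @ Xk.T)       # EVD of the scatter matrix
            L[:, k] = V[:, -1]                     # principal eigenvector
    return L, labels
```

In the single dominant case each cluster gathers samples lying along one mixing column, so the recovered lines estimate the columns of A up to sign and permutation.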

Regarding the multiple dominant SCA problem, a series subspace-based direction finding methods [20], [21] are presented. Among these works, the observed samples are assumed to be concentrated around q-dimensional subspaces which are spanned by a set of q mixing vectors, such that the mixing matrix can be identified by searching out all the concentration subspaces. The challenge of these algorithms is that the computational time grows exponentially with the increase of the problem scale. Recently, Naini et al. [22] proposed an improved subspace-based clustering algorithm, namely partial q-dimensional sparse subspace clustering. They try to estimate the mixing matrix only by using partial selected concentration subspaces instead of all the concentration subspaces. Such strategy could reduce the computational burden, partly. However, the exponential growth computation problem is still far from being fully solved.

In this paper, we propose a discriminatory clustering algorithm combined with a weighting scheme for hyperline identification. We relax the restriction of strictly sparse sources to the multiple dominant case [9], namely q-sparse (q>1) sources. Based on the number of active source components at each instant, the observed samples are classified into two major types: single dominant samples and multiple dominant samples. Inspired by the intuitive observation that, in the scatter plot of the data samples, the hyperline directions are delineated by the single dominant samples, we design a weighting scheme that improves the efficiency of hyperline clustering by exploiting the features of single dominant samples. Specifically, we utilize the Gaussian membership function as the weight factor, which has been widely applied in image recognition [23] and the fuzzy clustering literature [24]. Moreover, we demonstrate that when the tolerance parameter of the Gaussian weight factor is selected close to the noise variance level, the single dominant samples are preserved while the multiple dominant samples are maximally suppressed. As a result, the multiple dominant SCA problem is converted into a general single dominant SCA problem, which can be solved by the hyperline clustering method without resorting to any subspace-based clustering. The advantages of our scheme are also verified in the simulation results.
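The weighting idea can be sketched as follows. We assume (since the full text is not shown here) that each sample is weighted by a Gaussian membership of its distance to the nearest current hyperline; the exact weight definition in the paper may differ, and the names below are hypothetical:

```python
import numpy as np

def gaussian_weights(X, L, sigma2):
    """Gaussian membership weights for samples relative to hyperlines.

    X : (m, T) observed samples; L : (m, K) unit-norm line directions;
    sigma2 : tolerance parameter, chosen near the noise variance level.
    A sample lying on some line (single dominant) gets weight ~1, while a
    sample far from every line (multiple dominant) is suppressed toward 0.
    """
    proj2 = np.abs(L.T @ X) ** 2                      # (K, T) squared projections
    d2 = np.sum(X ** 2, axis=0) - proj2.max(axis=0)   # squared distance to nearest line
    return np.exp(-d2 / (2.0 * sigma2))               # Gaussian membership function
```

With sigma2 near the noise variance, on-line samples keep full weight while off-line samples decay exponentially, which is the discriminatory effect described above.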

The remainder of this paper is organized as follows. Section 2 gives a short introduction to the K-HLC algorithm. Section 3 offers the complete framework of K-weighted hyperline clustering (K-WHLC). Section 4 presents various simulation experiments demonstrating the validity of K-WHLC. Finally, Section 5 concludes the paper.


Background of K-hyperlines clustering

In this section, we review the K-hyperline clustering learning algorithm for the sufficiently sparse SCA problem. The sparse components of s(t) satisfy the disjoint orthogonality condition, i.e., s_i(t) s_j(t) ≈ 0 (i ≠ j, i, j ∈ {1, 2, …, n}). Under this strict assumption, the clustering problem can be converted into the following optimization problem [13]:

min_{l_k, Ω_k, k=1,…,K}  J(l_k, Ω_k, k=1,…,K) = Σ_{t=1}^{T} Σ_{k=1}^{K} d²(x(t), l_k) · I_{t∈Ω_k},

where each hyperline l_k is assumed to be normalized, that is, ‖l_k‖₂² = 1; the Euclidean distance between
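For a unit-norm line l_k, the squared distance of a sample to the line is d²(x, l_k) = ‖x‖² − (l_kᵀx)², so the objective J above can be evaluated directly from a partition. A small sketch (our own helper, not from the paper):

```python
import numpy as np

def hyperline_objective(X, L, labels):
    """Evaluate J = sum_t sum_k d^2(x(t), l_k) * I_{t in Omega_k}.

    X : (m, T) samples; L : (m, K) unit-norm lines; labels : (T,) cluster
    indices encoding the partition {Omega_k}. Uses
    d^2(x, l) = ||x||^2 - (l^T x)^2 for unit-norm l.
    """
    d2_all = np.sum(X ** 2, axis=0)[None, :] - (L.T @ X) ** 2   # (K, T)
    return float(d2_all[labels, np.arange(X.shape[1])].sum())
```

A sample lying exactly on its assigned line contributes zero, so J = 0 for a perfect partition of noiseless single dominant data.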

Assumptions and system model

In this paper, the linear system (1) is based on the following three assumptions:

  • (A1)

    For any i ∈ {1, …, n}, the source component s_i is assumed to be statistically independent with distribution N(0, σ_s²); for any j ∈ {1, …, m}, the noise entry e_j is assumed to be statistically independent with distribution N(0, σ_e²), where σ_e² ≪ σ_s². Any s_i and e_j are assumed to be mutually independent.

  • (A2)

    Sources are assumed to be q-sparse (1 ≤ q ≤ m − 1) signals; that is, the number of active source components at each instant is less than or equal to q.

  • (A3)

Numerical results

In this section, we offer a series of experiments to test the validity of our proposed K-WHLC algorithm. All the simulations are implemented in the Matlab 7.1 environment and run on an Intel Core Duo T6670 2.2 GHz processor with 512 MB RAM under the Microsoft Windows XP operating system. The precision of the estimation algorithms is measured in terms of the Signal-to-Interference Ratio (SIR):

SIR(A, Â) = −20 log₁₀( (1/n) Σ_{i=1}^{n} min_{j=1,…,n} min{ ‖α_i − α̂_j‖_F, ‖α_i + α̂_j‖_F } / ‖α_i‖_F )  (dB),

where ‖·‖_F is the Frobenius norm,
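The SIR metric above can be sketched as follows; the sign convention (a minus in front of the logarithm, so that a better estimate gives a larger dB value) is our reading of the garbled formula, and the column permutation is handled by the inner minimum over j:

```python
import numpy as np

def sir_db(A, A_hat):
    """Sketch of the SIR metric: for each true column, take the best match
    among estimated columns, up to sign and permutation, average the
    normalized errors, and convert to dB (larger is better)."""
    n = A.shape[1]
    err = 0.0
    for i in range(n):
        # min over estimated columns j, and over the sign ambiguity +/-
        d = min(min(np.linalg.norm(A[:, i] - A_hat[:, j]),
                    np.linalg.norm(A[:, i] + A_hat[:, j]))
                for j in range(n))
        err += d / np.linalg.norm(A[:, i])
    return -20.0 * np.log10(err / n)
```

Because both the difference and the sum of each column pair are considered, the metric is invariant to the sign indeterminacy inherent in blind identification.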

Conclusions

In this paper, we proposed a new hyperline clustering algorithm, namely K-WHLC, for the general SCA problem. A discriminatory clustering via a weighting scheme has been employed to improve the performance of mixing matrix estimation. A principle for selecting the tolerance parameter of the weight factor has been developed, facilitating optimal hyperline clustering performance. K-WHLC is a general hyperline clustering method in which the sources are no longer required to be sufficiently sparse.

Acknowledgments

This work was supported in part by the Natural Science Foundation of Guangdong Province (S2011030002886 and S2012010008813), and in part by the projects of Science and Technology of Guangzhou (2014J4100209).

Jun-Jie Yang received the B.S. degree from Guangdong University of Technology, Guangzhou, China, in 2011. Currently he is working towards his Ph.D. degree in pattern recognition at the School of Automation of Guangdong University of Technology. His current research interests include blind signal processing, physical layer security and smart grid cyber-security.

References (27)

  • H.L. Liu et al.

    On blind source separation using generalized eigenvalues with a new metric

    Neurocomputing

    (2008)
  • Z.S. He et al.

    K-hyperline clustering learning for sparse component analysis

    IEEE Trans. Signal Process.

    (2009)
  • M. Lavielle

    Bayesian deconvolution of Bernoulli–Gaussian processes

    Signal Process.

    (1993)
  • P. Georgiev et al.

    Sparse Component Analysis: A New Tool for Data Mining

    Data Mining in Biomedicine, Series: Optimization and Its Applications, Springer

    (2007)
  • Y.Q. Li et al.

    Underdetermined blind source separation based on sparse representation

    IEEE Trans. Signal Process.

    (2006)
  • A. Cichocki et al.

    Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications

    (2003)
  • Z.S. He et al.

    Convolutive blind source separation in frequency domain based on sparse representation

    IEEE Trans. Audio Speech Lang. Process.

    (2007)
  • Y.Q. Li et al.

    Blind estimation of channel parameters and source components for EEG signals: a sparse factorization approach

    IEEE Trans. Neural Netw.

    (2006)
  • I.F. Gorodnitsky et al.

    Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm

    IEEE Trans. Signal Process.

    (1997)
  • D.L. Donoho

    Compressed sensing

    IEEE Trans. Inf. Theory

    (2006)
  • P. Georgiev et al.

    Sparse component analysis and blind source separation of underdetermined mixtures

    IEEE Trans. Neural Netw.

    (2005)
  • D.L. Donoho et al.

    Maximal sparsity representation via l1 minimization

    Proc. Natl. Acad. Sci.

    (2003)
  • F. Theis et al.

    Linear geometric ICA: fundamentals and algorithms

    Neural Comput.

    (2003)

Hai-Lin Liu received the Ph.D. degree in control theory and engineering from South China University of Technology, Guangzhou, China, in 2002, and the M.S. degree in applied mathematics from Xidian University, Xi'an, China, in 1989. He is currently a Professor at the School of Applied Mathematics, Guangdong University of Technology. His research interests include evolutionary computation and optimization, and blind signal processing.
