Signal Processing

Volume 91, Issue 5, May 2011, Pages 1150-1156

Estimating source kurtosis directly from observation data for ICA

https://doi.org/10.1016/j.sigpro.2010.11.002

Abstract

Based on fourth order blind identification (FOBI) in ICA methods, this paper shows that the kurtosis of each independent source component can be estimated from the eigenvalues of two related matrices computed directly from the observation data. With the new approach, the total number of super-Gaussian components in noiseless observation data can be calculated directly, before any de-mixing algorithm is performed. As an application, a new switching criterion for the extended infomax algorithm is presented. Experimental results demonstrate the effectiveness of the kurtosis estimation algorithm and of the new alternative switching criterion.

Introduction

Independent component analysis (ICA) is a statistical model in which the observed data are expressed as linear combinations of underlying latent variables. The latent variables are assumed to be non-Gaussian and mutually independent. The main challenge in ICA is to find both the latent variables and the mixing process. A noiseless linear model is often used in ICA, based on the following assumption:

$$x = As,$$

where A is an n×n mixing matrix (not known), s is the original source vector of independent latent variables called the independent components, and x is the n×1 vector of observed random variables.
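As a concrete illustration, the following minimal Python sketch generates observations under this noiseless model. The Laplace and uniform sources and the random mixing matrix are illustrative assumptions, not data from the paper.

```python
# Minimal sketch of the noiseless linear ICA model x = A s.
import numpy as np

rng = np.random.default_rng(0)
n, T = 2, 10_000                    # number of sources and of samples

s = np.vstack([
    rng.laplace(size=T),            # super-Gaussian source (positive kurtosis)
    rng.uniform(-1, 1, size=T),     # sub-Gaussian source (negative kurtosis)
])
A = rng.normal(size=(n, n))         # unknown mixing matrix (here: random)
x = A @ s                           # observed mixtures, one column per sample
```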

ICA has been exploited for many years by various researchers, and a number of results can be found in the existing literature. For a detailed review, readers are referred to two monographs [1], [2]. In many ICA algorithms, solutions based on measurements of non-Gaussianity are well established and understood, and there are at least two kinds of approaches that involve the estimation of kurtosis or its sign. One of the two approaches is based on information theory and is equivalent to the minimization of the following objective function:

$$J(W) = -H(y) - \sum_{i=1}^{n} \int P_W(y_i; W) \log P_i(y_i)\, dy_i,$$

where $H(y) = -\int P(y) \log P(y)\, dy$ is the entropy of y, $P_i(y_i)$ is the pre-determined model probability density function (pdf), and $P_W(y_i; W)$ is the pdf based on y = Wx. By means of optimization, the estimates of the de-mixing matrix W and of the original signal, whose components are statistically independent, can be obtained.
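For intuition, minimizing J(W) is equivalent to maximizing the log-likelihood contrast $\log|\det W| + \sum_i E[\log P_i(y_i)]$, since $H(y) = H(x) + \log|\det W|$ under y = Wx. The sketch below evaluates this empirical contrast; the logistic model log-pdf is an illustrative assumption for super-Gaussian sources, not the paper's specific choice.

```python
# Hedged sketch: empirical ML contrast whose maximization minimizes J(W).
import numpy as np

def ml_contrast(W, x):
    """log|det W| + mean over samples of sum_i log P_i(y_i), with y = W x."""
    y = W @ x
    # Logistic model log-pdf (up to a constant): log P(u) = -u - 2*log(1 + e^{-u})
    log_p = -y - 2.0 * np.logaddexp(0.0, -y)
    return np.linalg.slogdet(W)[1] + log_p.sum(axis=0).mean()
```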

As is well known, any steepest gradient descent learning algorithm, such as the natural or relative gradient algorithm [3], [4], works only in the case that the components of s are all either super- or sub-Gaussian. If one separates observation data mixed with both super- and sub-Gaussian components by matching model pdfs with J(W) in Eq. (2), the key issue becomes how one chooses the model pdf Pi(yi). For the general ICA method, based on experiments, Xu et al. have summarized the one-bit-matching principle, which states that "all the sources can be separated as long as there is a one-to-one same-sign-correspondence between the kurtosis signs of all source pdf's and the kurtosis signs of all model pdf's" [5]. Although proofs of the conjecture are still open under general assumptions, a large number of experiments have been conducted that demonstrate its correctness.

On the other hand, some ICA algorithms directly exploit the estimate of the sample kurtosis as the objective function [1], [2], [6], [7], [8]; these are known as deflation methods. This approach achieves blind source separation by extracting the sources one by one. Recently, the study of deflation methods has been extended to take into account prior knowledge about the sources [9], [10] or about the objective function [11].
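To make the deflation idea concrete, the sketch below extracts directions one at a time on whitened data by driving the sample kurtosis of wᵀv to an extremum, removing each found direction before the next. The fixed-point update is the well-known kurtosis-based one (as in FastICA with the cubic nonlinearity); it is offered only as an illustration of the approach, not as the cited algorithms.

```python
# Hedged sketch of kurtosis-based deflation on whitened data v (n x T).
import numpy as np

def deflate(v, n_sources, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    n, T = v.shape
    W = np.zeros((n_sources, n))
    for k in range(n_sources):
        w = rng.normal(size=n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ v                                 # current 1-D projection
            w = (v * y**3).mean(axis=1) - 3.0 * w     # kurtosis fixed-point step
            w -= W[:k].T @ (W[:k] @ w)                # deflationary orthogonalization
            w /= np.linalg.norm(w)
        W[k] = w
    return W          # rows are estimated de-mixing directions (up to sign/order)
```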

Kurtosis is the simplest statistical quantity that is commonly used as a measure of the non-Gaussianity of a random variable. The sign of the kurtosis determines whether the random variable has a super- or sub-Gaussian distribution. In many ICA algorithms, the kurtosis is used to direct the choice of nonlinearities or to steer the algorithm toward maximization or minimization. Generally speaking, the estimation of kurtosis is embedded in the algorithm itself, so that kurtosis estimation and the de-mixing process are carried out alternately, either via the definition of kurtosis or via some assumed distribution, depending on the given algorithm. Since the number of super-Gaussian components in the observation data is not known beforehand, many de-mixing algorithms become complicated and computationally expensive; e.g., the extended infomax algorithm [12] and the gradient algorithm using kurtosis [2]. Alternatively, the kurtosis (or its sign) of the model pdfs can be estimated by introducing the generalized Gaussian distribution [13], [14].

In this paper, we first prove a theorem on estimating the kurtosis of the components of the mixed source directly from noiseless observation data, without depending on any de-mixing algorithm. Then, using the theorem, we present an algorithm that estimates the kurtoses of the components of the original sources from the whitened observation data before any de-mixing operation is carried out. Furthermore, we propose a new alternative switching criterion for the extended infomax algorithm as an application of the algorithm presented in this paper. We also present experimental results that demonstrate the effectiveness of the proposed algorithm. We emphasize that the new alternative criterion is as good as the original one, while the computational cost can be reduced considerably.


The kurtosis estimation for source components

The kurtosis of a random variable Y (assuming E(Y) = 0) is defined as

$$\kappa_4 = E(Y^4) - 3\left(E(Y^2)\right)^2.$$

In this section, based on fourth order blind identification [4], we first prove a theorem that expresses the kurtosis of the independent source components in terms of the eigenvalues of certain related matrices. By the theorem, an algorithm for estimating the source kurtoses directly from the noiseless observed data is then presented.
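The sample counterpart of this definition is immediate; the sketch below computes it and checks the expected signs for Laplace (super-Gaussian), uniform (sub-Gaussian), and Gaussian samples. The distributions are illustrative assumptions.

```python
# Sample kurtosis per the definition kappa_4 = E(Y^4) - 3 (E(Y^2))^2.
import numpy as np

def kurtosis(y):
    y = y - y.mean()                   # enforce the zero-mean assumption
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

rng = np.random.default_rng(1)
print(kurtosis(rng.laplace(size=100_000)))         # > 0: super-Gaussian
print(kurtosis(rng.uniform(-1, 1, size=100_000)))  # < 0: sub-Gaussian
print(kurtosis(rng.normal(size=100_000)))          # ~ 0: Gaussian
```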

Theorem 1

Suppose that the observation data are

$$x = As,$$

where s is an n×1 source vector, A is an n×n …

Numerical experiments

From the algorithm obtained in the above section, we know that the estimate of the kurtosis of each component in the mixture data can be obtained from the eigenvalues of $E(\bar{v}\bar{v}^T)$ and $E(S\bar{v}\bar{v}^T)$ via Eq. (17). We conducted the following experiments to test the validity and reliability of the algorithm.
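Since Eq. (17) itself is not shown in this snippet, the sketch below relies on the standard FOBI identity instead: for whitened data v with n unit-variance independent components, the eigenvalues of $E[(v^T v)\, v v^T]$ equal $\kappa_i + n + 2$, so each source kurtosis, and hence the number of super-Gaussian components, can be read off before any de-mixing. The function name and this reconstruction are assumptions, not the paper's exact Eq. (17).

```python
# Hedged FOBI-style sketch: source kurtoses from observation data x (n x T).
import numpy as np

def estimate_source_kurtoses(x):
    n, T = x.shape
    x = x - x.mean(axis=1, keepdims=True)
    # Whitening: v = C^{-1/2} x, with C = E(x x^T)
    d, E = np.linalg.eigh((x @ x.T) / T)
    v = (E @ np.diag(d ** -0.5) @ E.T) @ x
    # Weighted fourth-order matrix E[(v^T v) v v^T]
    S = np.sum(v * v, axis=0)           # squared norm of each sample
    Q = (v * S) @ v.T / T
    lam = np.linalg.eigvalsh(Q)
    return lam - (n + 2)                # kurtosis estimates, up to ordering

# (estimate_source_kurtoses(x) > 0).sum() then counts super-Gaussian components.
```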

We selected two groups of random variables, belonging to super- and sub-Gaussian distributions, respectively; their differing kurtoses are listed in Table 1.

Ten data sets were …

An application

Based on the algorithm, we present a new switching criterion for the extended infomax algorithm proposed by Lee et al.

By selecting φ(u) = u ± tanh(u) for super- and sub-Gaussian sources, Lee et al. give the natural gradient learning algorithm with a switching criterion [12]:

$$\Delta W \propto \left[I_n - K \tanh(u) u^T - u u^T\right] W,$$

where K = diag[k₁, k₂, …, kₙ] is an n×n diagonal matrix with kᵢ = 1 for super-Gaussian and kᵢ = −1 for sub-Gaussian model pdfs, respectively. According to the sufficient condition for the asymptotic stability of the …
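A minimal sketch of one step of this learning rule is given below, with the switching matrix K fixed in advance from kurtosis signs estimated directly from the observations (in the spirit of the new criterion) rather than re-estimated inside the loop; the function name and the learning rate are illustrative assumptions.

```python
# Hedged sketch of one extended-infomax natural-gradient step.
import numpy as np

def extended_infomax_step(W, x, k_signs, lr=0.01):
    n, T = x.shape
    u = W @ x                                   # current separated signals
    K = np.diag(k_signs)                        # +1 super-, -1 sub-Gaussian
    grad = np.eye(n) - (K @ np.tanh(u) @ u.T + u @ u.T) / T
    return W + lr * grad @ W
```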

Conclusion

With the help of the assumption of the mutual independence of the source signal components, we have proved that the kurtosis of each source component can be estimated directly from the observation data, and a kurtosis estimation algorithm has been proposed accordingly. Experiments demonstrate that the algorithm is valid and reliable. As an application of the algorithm, we have proposed a new alternative switching criterion for the extended infomax algorithm. In contrast with known switching criteria, the new criterion …

Acknowledgments

The author gratefully thanks the anonymous reviewers and the associate editor for their insightful and helpful comments.
