Elsevier

Signal Processing

Volume 64, Issue 3, 26 February 1998, Pages 291-300

Information–theoretic approach to blind separation of sources in non-linear mixture

https://doi.org/10.1016/S0165-1684(97)00196-5

Abstract

The linear mixture model is assumed in most of the papers devoted to blind separation. A more realistic model for mixture should be non-linear. In this paper, a two-layer perceptron is used as a de-mixing system to separate sources in non-linear mixture. The learning algorithms for the de-mixing system are derived by two approaches: maximum entropy and minimum mutual information. The algorithms derived from the two approaches have a common structure. The new learning equations for the hidden layer are different from the learning equations for the output layer. The natural gradient descent method is applied in maximizing entropy and minimizing mutual information. The information (entropy or mutual information) back-propagation method is proposed to derive the learning equations for the hidden layer.

Abstract (German version)

Most work on blind source separation assumes a linear mixing model. In this paper, by contrast, a two-layer perceptron is used as a system for separating sources in a non-linear mixture. The learning algorithms for the separating system are derived from two approaches: maximum entropy and minimum mutual information. The algorithms derived from the two approaches share a common structure. The resulting equations for the hidden layer differ from those for the output layer. The natural gradient descent method is used to maximize the entropy and to minimize the mutual information. For the hidden layer, the information (entropy or mutual information) back-propagation method is proposed for deriving the learning equations.

Abstract (French version)

A linear mixture model is assumed in most papers devoted to blind separation. A more realistic mixture model should be non-linear. In this paper, a two-layer perceptron is used as a system for separating sources in a non-linear mixture. The algorithms for the separation system are derived using two approaches: maximum entropy and minimum mutual information. The algorithms derived from these two approaches have a common structure. The new learning equations for the hidden layer differ from those for the output layer. The natural gradient method is applied to maximize entropy and to minimize mutual information. The information (entropy or mutual information) back-propagation method is proposed to derive the learning equations for the hidden layer.

Introduction

One of the important issues in signal processing is to find a set of statistically independent components from the observed data by a linear or non-linear transformation. Most blind separation algorithms are based on the theory of independent component analysis (ICA) [8], which applies when the mixture model is linear. The idea of ICA is to find the independent components in a mixture of statistically independent source signals by optimizing some criterion. Some blind separation algorithms, such as those in Refs. [1, 2, 5, 6, 7], have the equivariant property when there is no noise in the observation of the mixture. However, for the non-linear mixture model, the linear ICA theory is not applicable and the equivariant property does not hold. Blind separation algorithms for the linear mixture model generally fail to extract the independent sources in a non-linear mixture.

The self-organizing map (SOM) has been used to extract sources in non-linear mixtures [9, 11, 12]. It is a model-free method but suffers from (a) exponential growth of the network complexity and (b) interpolation error in recovering continuous sources.

An extension of ICA to the separation of sources in a non-linear mixture is to employ a non-linear function to transform the mixture such that the outputs become statistically independent. However, unless the function class for de-mixing transforms is limited, this extension may give statistically independent outputs that tell nothing about the sources, because any random variable with a continuous distribution function can be transformed to a uniformly distributed random variable, and random variables that are jointly uniform on a hypercube are always independent. This transformation is not unique. Limiting the function class for de-mixing functions is equivalent to assuming some knowledge about the mixing function. We employ a two-layer perceptron, a parametric model, as the de-mixing system. A two-layer perceptron was also used in Ref. [4] to separate sources in a non-linear mixture, with the gradient descent method minimizing the mutual information (a measure of dependence) of the outputs. Compared with the approach in Ref. [4], this paper includes the following innovations:

  1. on-line learning algorithms derived by maximizing entropy and minimizing mutual information,
  2. the learning algorithms consisting of (a) the learning equations for the hidden-output layer, which are the same as those in [2, 3, 6, 15] for a linear de-mixing system, and (b) novel learning equations for the input-hidden layer and the thresholds of the hidden neurons,
  3. using the natural gradient descent method to optimize cost functions (entropy or mutual information) without constraints.
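The natural gradient descent named in item 3 is easiest to see in the linear case. The following is a minimal sketch, not the paper's algorithm: it uses the natural-gradient rule commonly stated in the linear blind-separation literature, `W <- W + lr * (I - phi(y) y^T) W`, with several assumptions made here for illustration only: `phi = tanh`, Laplacian (super-Gaussian) sources, batch averaging instead of on-line updates, and a hand-picked 2×2 mixing matrix `A`. Whether this matches the paper's hidden-output equations term by term is not shown in this excerpt.

```python
import numpy as np

def natural_gradient_ica(x, lr=0.1, n_iter=500):
    """Batch natural-gradient update for a linear de-mixing matrix W.

    x has shape (n, num_samples). phi = tanh is an assumed score function
    that suits super-Gaussian sources.
    """
    n, num_samples = x.shape
    W = np.eye(n)
    for _ in range(n_iter):
        y = W @ x
        # Sample estimate of E[phi(y) y^T]
        C = np.tanh(y) @ y.T / num_samples
        # Natural gradient step: (I - E[phi(y) y^T]) W
        W = W + lr * (np.eye(n) - C) @ W
    return W

def crosstalk(P):
    """Worst-case ratio of off-peak mass to peak magnitude over the rows of P."""
    abs_p = np.abs(P)
    peak = abs_p.max(axis=1)
    return np.max((abs_p.sum(axis=1) - peak) / peak)

# Hypothetical demo data: two Laplacian sources and an arbitrary mixing matrix.
rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 10_000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])
W = natural_gradient_ica(A @ s)

print("cross-talk before:", crosstalk(A))
print("cross-talk after: ", crosstalk(W @ A))
```

If separation succeeds, `W @ A` is close to a scaled permutation matrix, so the cross-talk measure drops well below its value for `A` alone. The update needs no matrix inversion, which is one practical reason the natural gradient is attractive for on-line learning.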

To estimate the mutual information, instead of using the Taylor expansion of a multi-variable characteristic function [4], we use the Gram–Charlier expansion to approximate the marginal probability density function and obtain a more reliable estimation of the mutual information.
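As a rough illustration of the idea, the sketch below evaluates a generic fourth-order truncated Gram–Charlier approximation of a zero-mean, unit-variance density from its third and fourth cumulants. The truncation order, the function names, and the sample-based cumulant estimates are assumptions made here; the exact expansion used in the paper is not shown in this excerpt.

```python
import numpy as np

def gram_charlier_pdf(y, kappa3, kappa4):
    """Fourth-order truncated Gram-Charlier approximation of a density.

    Assumes a zero-mean, unit-variance variable with third cumulant kappa3
    and fourth cumulant kappa4. Uses probabilists' Hermite polynomials.
    """
    y = np.asarray(y, dtype=float)
    gauss = np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)
    he3 = y**3 - 3.0 * y
    he4 = y**4 - 6.0 * y**2 + 3.0
    return gauss * (1.0 + kappa3 / 6.0 * he3 + kappa4 / 24.0 * he4)

def sample_cumulants(samples):
    """Estimate (kappa3, kappa4) after standardizing the samples."""
    z = (samples - samples.mean()) / samples.std()
    return np.mean(z**3), np.mean(z**4) - 3.0

# Hypothetical example: a uniform source, which is sub-Gaussian.
rng = np.random.default_rng(0)
k3, k4 = sample_cumulants(rng.uniform(-1.0, 1.0, size=100_000))
grid = np.linspace(-10.0, 10.0, 20_001)
step = grid[1] - grid[0]
print("kappa3, kappa4:", k3, k4)
print("integral of approximation:", np.sum(gram_charlier_pdf(grid, k3, k4)) * step)
```

Because the Hermite terms are orthogonal to the Gaussian weight, the approximation still integrates to one for any cumulant values, although a truncated expansion can become negative in the tails.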

A short version of this paper appeared in Ref. [17]. The non-linear mixture model considered there and in this paper contains crossing non-linearities in the mixture. This model is more general than that in Refs. [10, 14], where the non-linear mixture is obtained by applying non-linear functions componentwise to the linear mixture.
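To make the distinction concrete, the two model classes can be written side by side. The symbols $f_i$ and $a_{ij}$ are notation assumed here for illustration; the second form is the mixing function given in the Simulation section.

```latex
% Componentwise model, as described for Refs. [10, 14]:
% each observation is a non-linear function of a single linear mixture.
x_i(t) = f_i\Big(\sum_{j=1}^{n} a_{ij}\, s_j(t)\Big), \qquad i = 1, \dots, n

% Model with crossing non-linearities (Simulation section):
% the non-linear hidden outputs are mixed again by A_2.
x(t) = A_2 \tanh\big(A_1 s(t)\big)
```

In the second form every observation depends on all of the hidden non-linear outputs, so it cannot in general be reduced to one non-linearity per channel.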

Section snippets

Mixture models and de-mixing systems

Let us consider unknown source signals $s_i(t)$, $i = 1, \dots, n$, which are mutually independent and stationary. It is assumed that each source has zero mean and moments of all orders, and that at most one source is Gaussian.

Before we describe the non-linear mixture model, let us briefly review the linear mixture model $x(t) = A s(t)$, where $A$ is an unknown non-singular mixing matrix, $s(t) = [s_1(t), \dots, s_n(t)]^T$ and $x(t) = [x_1(t), \dots, x_n(t)]^T$. In this paper, the transpose operation is denoted by $(\cdot)^T$. A linear de-mixing system y

Information back-propagation approach

We have explained why the minimum mutual information (MMI) approach can be applied for extracting independent sources in the non-linear mixture when the inverse of the non-linear mixing function is approximated by the two-layer perceptron. We can also apply the maximum entropy (ME) approach in Ref. [3] to train the de-mixing system. When the mixture model is linear, the ME approach is justified by its relation with the MMI approach [16]. The two approaches are generally equivalent around the

The information BP vs the error BP

Blind separation of sources is an unsupervised learning problem. We employ a two-layer perceptron as the de-mixing system to extract the sources in the non-linear mixture. However, we cannot use the well-known error BP method to train this multilayer network, since the desired signals are not accessible. Instead, we use information criteria such as entropy and mutual information as error signals to derive the information BP algorithms. The block diagram of the entropy BP is plotted in Fig. 2.
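One way to see how an entropy criterion can act as a training signal without target outputs is through the Jacobian of the de-mixing map: for an invertible map, the output entropy equals the input entropy plus the expected log absolute Jacobian determinant. The sketch below assumes a square de-mixing perceptron of the form `y = W2 @ tanh(W1 @ x + theta)`; this form, the `tanh` activation, and all names are assumptions for illustration, since the paper's exact network equations are not shown in this excerpt.

```python
import numpy as np

def demix(x, W1, theta, W2):
    """Assumed two-layer de-mixing perceptron: y = W2 tanh(W1 x + theta)."""
    return W2 @ np.tanh(W1 @ x + theta)

def log_abs_det_jacobian(x, W1, theta, W2):
    """log|det(dy/dx)| for the square network above.

    dy/dx = W2 diag(1 - tanh(u)^2) W1 with u = W1 x + theta, so the log
    determinant splits into one term per layer plus the activation slopes.
    """
    u = W1 @ x + theta
    slopes = 1.0 - np.tanh(u) ** 2
    _, logdet1 = np.linalg.slogdet(W1)
    _, logdet2 = np.linalg.slogdet(W2)
    return logdet1 + logdet2 + np.sum(np.log(slopes))

def numerical_jacobian(x, W1, theta, W2, eps=1e-6):
    """Central-difference Jacobian, used only to check the analytic formula."""
    n = x.size
    J = np.zeros((n, n))
    for k in range(n):
        d = np.zeros(n)
        d[k] = eps
        J[:, k] = (demix(x + d, W1, theta, W2) - demix(x - d, W1, theta, W2)) / (2 * eps)
    return J

# Hypothetical parameters and input.
rng = np.random.default_rng(1)
n = 3
W1 = np.eye(n) + 0.3 * rng.standard_normal((n, n))
W2 = np.eye(n) + 0.3 * rng.standard_normal((n, n))
theta = 0.1 * rng.standard_normal(n)
x = rng.standard_normal(n)

analytic = log_abs_det_jacobian(x, W1, theta, W2)
numeric = np.linalg.slogdet(numerical_jacobian(x, W1, theta, W2))[1]
print(analytic, numeric)
```

The term involving the hidden activations depends on `W1` and `theta` through `u`, which gives some intuition for why the hidden-layer learning equations would differ in form from the output-layer ones.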

Simulation

Assume that the mixing function $f(u)$ in Fig. 1 is a two-layer perceptron with 5 input neurons, 5 hidden neurons and 5 output neurons:

$$x(t) = A_2 \tanh(A_1 s(t)),$$

where the source vector

$$s(t) = [\operatorname{sign}(\cos(2\pi 155 t)),\ \sin(2\pi 800 t),\ \sin(2\pi 300 t + 6\cos(2\pi 60 t)),\ \sin(2\pi 90 t),\ b(t)]^T$$

consists of four modulated signals and one random source $b(t)$ uniformly distributed in $[-1, 1]$. $A_1$ and $A_2$ are two $5 \times 5$ mixing matrices:

$$A_1 = \begin{bmatrix} 0.1420 & 0.3016 & -0.3863 & 0.2506 & 0.4690 \\ -0.4347 & -0.2243 & -0.3150 & 0.4064 & 0.2012 \\ 0.3543 & 0.3589 & 0.0942 & 0.0916 & 0.4059 \\ -0.4545 & \cdots \end{bmatrix}$$
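The simulation setup can be sketched as follows. The five source waveforms and the form of the mixing function are taken from the text above. The sampling rate `fs`, the number of samples, and both mixing matrices are placeholders: `A_1` is truncated in this excerpt and `A_2` is not shown, so random matrices stand in for them and the resulting mixtures will not match the paper's.

```python
import numpy as np

def make_sources(t, rng):
    """Five sources from the Simulation section, stacked as rows."""
    return np.vstack([
        np.sign(np.cos(2 * np.pi * 155 * t)),
        np.sin(2 * np.pi * 800 * t),
        np.sin(2 * np.pi * 300 * t + 6 * np.cos(2 * np.pi * 60 * t)),
        np.sin(2 * np.pi * 90 * t),
        rng.uniform(-1.0, 1.0, size=t.shape),  # random source b(t)
    ])

def mix(s, A1, A2):
    """Non-linear mixture x(t) = A2 tanh(A1 s(t))."""
    return A2 @ np.tanh(A1 @ s)

rng = np.random.default_rng(0)
fs = 10_000                      # assumed sampling rate in Hz, not from the paper
t = np.arange(2_000) / fs
s = make_sources(t, rng)

# Placeholder mixing matrices, NOT the paper's A1 and A2.
A1 = rng.uniform(-0.5, 0.5, size=(5, 5))
A2 = rng.uniform(-0.5, 0.5, size=(5, 5))
x = mix(s, A1, A2)

print(s.shape, x.shape)
```

Because `tanh` is bounded, each mixture channel is bounded by the absolute row sum of `A2`, which is a quick sanity check on generated data.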

Conclusions

Assuming the inverse of the non-linear mixing function can be approximated by a two-layer perceptron, we employ a two-layer perceptron as a de-mixing system to extract sources in the non-linear mixture. The two blind separation algorithms are derived by using two approaches: maximum entropy and minimum mutual information. The two algorithms are called the entropy BP algorithm and the mutual information BP algorithm, or collectively the information BP algorithms. The information (entropy or mutual) BP

References (17)

  • G. Burel, Blind separation of sources: a non-linear neural algorithm, Neural Networks (1992)
  • P. Comon, Independent component analysis, a new concept, Signal Process. (1994)
  • S. Amari, A. Cichocki, H.H. Yang, Recurrent neural networks for blind separation of sources, in: Proc. NOLTA 1995, vol....
  • S. Amari, A. Cichocki, H.H. Yang, A new learning algorithm for blind signal separation, in: D.S. Touretzky, M.C. Mozer,...
  • A.J. Bell et al., An information-maximisation approach to blind separation and blind deconvolution, Neural Comput. (1995)
  • J.-F. Cardoso, B. Laheld, Equivariant adaptive source separation, IEEE Trans. Signal Process. 44 (12) (December 1996)...
  • A. Cichocki, R. Unbehauen, Robust neural networks with on-line learning for blind identification and blind separation...
  • A. Cichocki, R. Unbehauen, L. Moszczyński, E. Rummert, A new on-line adaptive learning algorithm for blind separation...
There are more references available in the full text version of this article.
