Information-theoretic approach to blind separation of sources in non-linear mixture

doi:10.1016/S0165-1684(97)00196-5

Signal Processing

Volume 64, Issue 3, 26 February 1998, Pages 291-300

https://doi.org/10.1016/S0165-1684(97)00196-5

Abstract

The linear mixture model is assumed in most of the papers devoted to blind separation. A more realistic model for mixture should be non-linear. In this paper, a two-layer perceptron is used as a de-mixing system to separate sources in non-linear mixture. The learning algorithms for the de-mixing system are derived by two approaches: maximum entropy and minimum mutual information. The algorithms derived from the two approaches have a common structure. The new learning equations for the hidden layer are different from the learning equations for the output layer. The natural gradient descent method is applied in maximizing entropy and minimizing mutual information. The information (entropy or mutual information) back-propagation method is proposed to derive the learning equations for the hidden layer.

Abstract (translated from the German version)

Most work on the blind separation of sources assumes a linear mixing model. In this paper, by contrast, a two-layer perceptron is used as a system for separating sources in a non-linear mixture. The learning algorithms for the de-mixing system are derived from two approaches: maximum entropy and minimum mutual information. The algorithms derived from the two approaches share a common structure. The resulting equations for the hidden layer differ from those for the output layer. The natural gradient descent method is used to maximize the entropy and to minimize the mutual information. For the hidden layer, the information (entropy or mutual information) back-propagation method is proposed for deriving the learning equations.

Abstract (translated from the French version)

A linear mixture model is assumed in most papers devoted to blind separation. A more realistic mixture model should be non-linear. In this paper, a two-layer perceptron is used as a system for separating sources in a non-linear mixture. The algorithms for the separation system are derived using two approaches: maximum entropy and minimum mutual information. The algorithms derived from these two approaches have a common structure. The new learning equations for the hidden layer are different from those for the output layer. The natural gradient method is applied to maximize the entropy and to minimize the mutual information. The information (entropy or mutual information) back-propagation method is proposed to derive the learning equations for the hidden layer.

Introduction

One of the important issues in signal processing is to find a set of statistically independent components from the observed data by a linear or non-linear transformation. When the mixture model is linear, most blind separation algorithms are based on the theory of independent component analysis (ICA) [8]. The idea of ICA is to find the independent components in a mixture of statistically independent source signals by optimizing some criterion. Some blind separation algorithms, such as those in Refs. [1, 2, 5, 6, 7], have the equivariant property when there is no noise in the observation of the mixture. However, for the non-linear mixture model, the linear ICA theory is not applicable and the equivariant property does not hold. Blind separation algorithms for the linear mixture model generally fail to extract the independent sources in a non-linear mixture.

The self-organizing map (SOM) has been used to extract sources in non-linear mixtures [9, 11, 12]. It is a model-free method but suffers from (a) exponential growth of the network complexity and (b) interpolation error in recovering continuous sources.

An extension of ICA to the separation of sources in a non-linear mixture is to employ a non-linear function to transform the mixture such that the outputs become statistically independent. However, unless the function class for the de-mixing transforms is limited, this extension may give statistically independent outputs which tell nothing about the sources: any random variable with a continuous distribution function can be transformed to a uniformly distributed random variable, and the transformation can be constructed so that the resulting uniform variables are mutually independent. This transformation is not unique. Limiting the function class for the de-mixing functions is equivalent to assuming some knowledge about the mixing function. We employ a two-layer perceptron, a parametric model, as a de-mixing system. A two-layer perceptron was also used in Ref. [4] to separate sources in a non-linear mixture, with the gradient descent method minimizing the mutual information (a measure of dependence) of the outputs. Compared with the approach in Ref. [4], this paper includes the following innovations:

1. on-line learning algorithms derived by maximizing entropy and minimizing mutual information,
2. learning algorithms consisting of (a) the learning equations for the hidden-output layer, which are the same as those in Refs. [2, 3, 6, 15] for the linear de-mixing system, and (b) novel learning equations for the input-hidden layer and the thresholds of the hidden neurons,
3. use of the natural gradient descent method to optimize the cost functions (entropy or mutual information) without constraints.
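Items 2(a) and 3 refer to the natural gradient rule for a linear de-mixing matrix. As a rough illustration, the form commonly given for that rule in the linear case is $W \leftarrow W + \eta\,(I - \varphi(y) y^{T}) W$ with $y = Wx$. The sketch below implements one on-line step of this form; the function name, the step size `eta` and the choice `phi = tanh` are assumptions made for illustration and are not taken from this paper.

```python
import numpy as np


def natural_gradient_step(W, x, eta=0.01, phi=np.tanh):
    """One on-line update W <- W + eta * (I - phi(y) y^T) W for one sample x.

    W   : (n, n) current linear de-mixing matrix
    x   : (n,) one observed mixture sample
    eta : step size (illustrative value)
    phi : componentwise activation (illustrative choice)
    """
    y = W @ x                                   # current output estimate
    n = W.shape[0]
    # Right-multiplying by W is what distinguishes the natural gradient
    # from the ordinary gradient, which would need the inverse of W^T.
    return W + eta * (np.eye(n) - np.outer(phi(y), y)) @ W
```

Because the update multiplies by $W$ instead of inverting it, no matrix inversion is needed per sample, which is one reason this form suits on-line learning.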

To estimate the mutual information, instead of using the Taylor expansion of a multi-variable characteristic function [4], we use the Gram–Charlier expansion to approximate the marginal probability density function and obtain a more reliable estimation of the mutual information.
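To make the Gram–Charlier idea concrete, the sketch below evaluates the textbook expansion truncated after the fourth-order term, $p(y) \approx \phi(y)\,[1 + \tfrac{\kappa_3}{3!} H_3(y) + \tfrac{\kappa_4}{4!} H_4(y)]$, for a zero-mean, unit-variance variable, where $\phi$ is the standard normal density and $H_3$, $H_4$ are Chebyshev–Hermite polynomials. The truncation order and the function names are assumptions; the excerpt does not state which terms the paper retains.

```python
import numpy as np


def gram_charlier_pdf(y, kappa3, kappa4):
    """Truncated Gram-Charlier approximation of a standardized density."""
    y = np.asarray(y, dtype=float)
    gauss = np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)   # standard normal pdf
    h3 = y**3 - 3.0 * y                                  # Hermite polynomial H3
    h4 = y**4 - 6.0 * y**2 + 3.0                         # Hermite polynomial H4
    return gauss * (1.0 + kappa3 / 6.0 * h3 + kappa4 / 24.0 * h4)


def sample_cumulants(samples):
    """Estimate the third and fourth cumulants after standardizing the samples."""
    samples = np.asarray(samples, dtype=float)
    z = (samples - samples.mean()) / samples.std()
    return np.mean(z**3), np.mean(z**4) - 3.0
```

With both cumulants set to zero the approximation reduces to the Gaussian density, and the correction terms integrate to zero, so the approximation always integrates to one (although it can become negative for large cumulants).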

A short version of this paper appeared in Ref. [17]. The non-linear mixture model considered there and in this paper contains crossing non-linearities in the mixture. This model is more general than that in Refs. [10, 14], where the non-linear mixture is obtained by applying non-linear functions componentwise to the linear mixture.
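The non-uniqueness argument in this Introduction can be illustrated with a standard construction for two mixtures; it is not taken from this paper and assumes that $(x_{1}, x_{2})$ has a strictly positive joint density $p_{x}$. Let $F$ denote a (conditional) cumulative distribution function and define

$$
y_{1} = F_{x_{1}}(x_{1}), \qquad y_{2} = F_{x_{2} \mid x_{1}}(x_{2} \mid x_{1}).
$$

The Jacobian of this map is lower triangular, so its determinant is the product of the diagonal entries:

$$
\det \frac{\partial (y_{1}, y_{2})}{\partial (x_{1}, x_{2})} = p_{x_{1}}(x_{1})\, p_{x_{2} \mid x_{1}}(x_{2} \mid x_{1}) = p_{x}(x_{1}, x_{2}),
$$

$$
p_{y}(y_{1}, y_{2}) = \frac{p_{x}(x_{1}, x_{2})}{\left| \det \dfrac{\partial (y_{1}, y_{2})}{\partial (x_{1}, x_{2})} \right|} = 1 \quad \text{on } [0, 1]^{2}.
$$

The outputs are therefore independent and uniform whatever the sources were, which is why independence alone cannot identify the sources unless the class of de-mixing functions is restricted.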

Mixture models and de-mixing systems

Let us consider unknown source signals $s_{i}(t), i = 1, \ldots, n$, which are mutually independent and stationary. It is assumed that each source has zero mean and moments of all orders, and that at most one source is Gaussian.

Before we describe the non-linear mixture model, let us briefly review the linear mixture model $x(t) = A s(t)$, where $A$ is an unknown non-singular mixing matrix, $s(t) = [s_{1}(t), \ldots, s_{n}(t)]^{T}$ and $x(t) = [x_{1}(t), \ldots, x_{n}(t)]^{T}$. In this paper, the transpose operation is denoted by $(\cdot)^{T}$. A linear de-mixing system $y$

Information back-propagation approach

We have explained why the minimum mutual information (MMI) approach can be applied to extract independent sources in the non-linear mixture when the inverse of the non-linear mixing function is approximated by the two-layer perceptron. We can also apply the maximum entropy (ME) approach in Ref. [3] to train the de-mixing system. When the mixture model is linear, the ME approach is justified by its relation with the MMI approach [16]. The two approaches are generally equivalent around the

The information BP vs the error BP

Blind separation of sources is an unsupervised learning problem. We employ a two-layer perceptron as the de-mixing system to extract the sources in the non-linear mixture. However, we cannot use the well-known error BP method to train this multilayer network, since the desired signals are not accessible. Instead, we use information criteria such as entropy and mutual information as error signals to derive the information BP algorithms. The block diagram of the entropy BP is shown in Fig. 2.
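One way to see why an information criterion can replace the missing target signal: for a smooth invertible map $y = g(x)$, the output entropy is $H(y) = H(x) + E[\log |\det \partial y / \partial x|]$, and $H(x)$ does not depend on the network weights. The sketch below computes the log-determinant term for one sample, assuming a de-mixing network of the form $y = W_{2} \tanh(W_{1} x + \theta)$. This network form and all names are assumptions for illustration; the paper's actual learning equations are not reproduced in this excerpt.

```python
import numpy as np


def demix(x, W1, W2, theta):
    """Hypothetical two-layer perceptron: y = W2 tanh(W1 x + theta)."""
    return W2 @ np.tanh(W1 @ x + theta)


def log_abs_det_jacobian(x, W1, W2, theta):
    """log |det dy/dx| of the network above, evaluated at one sample x."""
    u = W1 @ x + theta                 # hidden-layer pre-activations
    d = 1.0 - np.tanh(u) ** 2          # derivative of tanh at u
    J = W2 @ (d[:, None] * W1)         # chain rule: W2 diag(d) W1
    _, logdet = np.linalg.slogdet(J)   # numerically stable log|det J|
    return logdet
```

Averaging this quantity over samples gives the weight-dependent part of the output entropy, which can then be differentiated with respect to `W1`, `W2` and `theta` in place of an error signal.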

Simulation

Assume that the mixing function $f(u)$ in Fig. 1 is a two-layer perceptron with 5 input neurons, 5 hidden neurons and 5 output neurons: $x(t) = A_{2} \tanh(A_{1} s(t))$, where the source vector $s(t) = [\operatorname{sign}(\cos(2\pi 155 t)), \sin(2\pi 800 t), \sin(2\pi 300 t + 6\cos(2\pi 60 t)), \sin(2\pi 90 t), b(t)]^{T}$ consists of four modulated signals and one random source $b(t)$ uniformly distributed in $[-1, 1]$. $A_{1}$ and $A_{2}$ are two $5 \times 5$ mixing matrices: $A_{1} = 0.1420 0.3016 −0.3863 0.2506 0.4690 −0.4347 −0.2243 −0.3150 0.4064 0.2012 0.3543 0.3589 0.0942 0.0916 0.4059 −0.4545$
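The simulation setup can be sketched as follows. The source formulas follow the text, but the sampling rate `fs`, the number of samples and the random seed are assumptions, and because the mixing matrices are truncated in this excerpt the usage example substitutes random placeholder matrices that are not the paper's values.

```python
import numpy as np


def make_sources(n_samples=2000, fs=10_000.0, seed=0):
    """Generate the five sources s(t); fs and n_samples are illustrative."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    s = np.vstack([
        np.sign(np.cos(2 * np.pi * 155 * t)),                           # square wave
        np.sin(2 * np.pi * 800 * t),                                    # high-frequency sine
        np.sin(2 * np.pi * 300 * t + 6 * np.cos(2 * np.pi * 60 * t)),   # phase-modulated sine
        np.sin(2 * np.pi * 90 * t),                                     # low-frequency sine
        rng.uniform(-1.0, 1.0, n_samples),                              # uniform noise b(t)
    ])
    return t, s


def mix(s, A1, A2):
    """Non-linear mixture x(t) = A2 tanh(A1 s(t)), applied column by column."""
    return A2 @ np.tanh(A1 @ s)


# Placeholder mixing matrices (hypothetical, NOT the values used in the paper).
_rng = np.random.default_rng(42)
A1_demo = _rng.uniform(-0.5, 0.5, (5, 5))
A2_demo = _rng.uniform(-0.5, 0.5, (5, 5))
t_demo, s_demo = make_sources()
x_demo = mix(s_demo, A1_demo, A2_demo)
```

The default `fs` of 10 kHz is chosen only so that the 800 Hz component is sampled well above its Nyquist rate.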

Conclusions

Assuming that the inverse of the non-linear mixing function can be approximated by a two-layer perceptron, we employ a two-layer perceptron as a de-mixing system to extract sources in the non-linear mixture. Two blind separation algorithms are derived by using two approaches: maximum entropy and minimum mutual information. The two algorithms are called the entropy BP algorithm and the mutual information BP algorithm, or the information BP algorithms collectively. The information (entropy or mutual) BP

References (17)

G. Burel, Blind separation of sources: a non-linear neural algorithm, Neural Networks (1992)
P. Comon, Independent component analysis, a new concept, Signal Process. (1994)
S. Amari, A. Cichocki, H.H. Yang, Recurrent neural networks for blind separation of sources, in: Proc. NOLTA 1995, vol....
S. Amari, A. Cichocki, H.H. Yang, A new learning algorithm for blind signal separation, in: D.S. Touretzky, M.C. Mozer,...
A.J. Bell et al., An information-maximisation approach to blind separation and blind deconvolution, Neural Comput. (1995)
J.-F. Cardoso, B. Laheld, Equivariant adaptive source separation, IEEE Trans. Signal Process. 44 (12) (December 1996)...
A. Cichocki, R. Unbehauen, Robust neural networks with on-line learning for blind identification and blind separation...
A. Cichocki, R. Unbehauen, L. Moszczyński, E. Rummert, A new on-line adaptive learning algorithm for blind separation...

There are more references available in the full text version of this article.
