Signal Processing

Volume 91, Issue 4, April 2011, Pages 759-772

Enhanced sampling schemes for MCMC based blind Bernoulli–Gaussian deconvolution

https://doi.org/10.1016/j.sigpro.2010.08.009

Abstract

This paper proposes and compares two new sampling schemes for sparse deconvolution using a Bernoulli–Gaussian model. To tackle such a deconvolution problem in a blind and unsupervised context, the Markov Chain Monte Carlo (MCMC) framework is usually adopted, and the chosen sampling scheme is most often the Gibbs sampler. However, such a sampling scheme fails to explore the state space efficiently. Our first alternative, the K-tuple Gibbs sampler, is simply a grouped Gibbs sampler. The second one, called partially marginalized sampler, is obtained by integrating the Gaussian amplitudes out of the target distribution. While the mathematical validity of the first scheme is obvious as a particular instance of the Gibbs sampler, a more detailed analysis is provided to prove the validity of the second scheme.

For both methods, optimized implementations are proposed in terms of computation and storage cost. Finally, simulation results validate both schemes as more efficient in terms of convergence time compared with the plain Gibbs sampler. Benchmark sequence simulations show that the partially marginalized sampler takes fewer iterations to converge than the K-tuple Gibbs sampler. However, its computation load per iteration grows almost quadratically with respect to the data length, while it only grows linearly for the K-tuple Gibbs sampler.

Introduction

The problem of restoring a sparse spike train x distorted by a linear system h and corrupted by noise ε, such that z = x*h + ε, arises in many fields such as seismic exploration [1], [2] and astronomy [3].

In this paper, we adopt a Bernoulli–Gaussian (BG) model for the spike train x, following [2] and many subsequent contributions such as [1], [4]. A BG signal is an independent, identically distributed (iid) process defined in two stages. Firstly, the sparse nature of the spikes is governed by the Bernoulli law $P(q) = \lambda^L (1-\lambda)^{M-L}$, with the Bernoulli sequence $q = [q_1, \ldots, q_M]^t$ (a binary sequence of length M) and $L = \sum_{m=1}^{M} q_m$ the number of non-zero entries of q. Secondly, the amplitudes $x = [x_1, \ldots, x_M]^t$ are assumed iid zero-mean Gaussian conditionally on q: $x\,|\,q \sim \mathcal{N}(0, \sigma_x^2\,\mathrm{diag}(q))$, where $\mathrm{diag}(q)$ denotes the diagonal matrix whose diagonal is q.
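As a minimal illustration of this two-stage model, the following sketch draws one BG sequence (function name and parameter values are illustrative, not taken from the paper):

```python
import numpy as np

def sample_bg(M, lam, sigma_x, rng=None):
    """Draw one Bernoulli-Gaussian sequence of length M.

    lam is the Bernoulli parameter (spike probability) and sigma_x the
    standard deviation of the Gaussian amplitudes.
    """
    rng = np.random.default_rng(rng)
    q = (rng.random(M) < lam).astype(int)      # Bernoulli labels
    x = q * rng.normal(0.0, sigma_x, size=M)   # amplitudes, zero where q = 0
    return q, x

q, x = sample_bg(M=100, lam=0.1, sigma_x=2.0, rng=0)
```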

The MCMC approach [5], [6] is a powerful numerical tool, appropriate to solve complex inference problems such as blind deconvolution. In the field of blind BG deconvolution, Cheng et al. pioneered the introduction of MCMC methods [1]. They proposed to rely on a plain Gibbs sampler, i.e., with a site-by-site updating scheme for the spike train, for which their algorithm constitutes a simple and canonical example. However, simulation results indicate that it lacks reliability: starting from different initial conditions, significantly different estimates are obtained, even after a considerable number of iterations.
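For concreteness, one site-by-site step of such a sampler can be sketched as follows: given all other sites, (q_m, x_m) is redrawn from its exact conditional, a Bernoulli draw followed by a Gaussian draw. This is a generic textbook version under the BG model, not the authors' exact implementation; H denotes the convolution matrix and all names are illustrative.

```python
import numpy as np

def gibbs_site_update(m, z, H, x, q, lam, sigma_x, sigma_e, rng):
    """One site-by-site Gibbs update of (q_m, x_m) given all other sites.

    H is the N x M observation (convolution) matrix. A standard
    derivation under the BG model gives a Bernoulli draw for q_m,
    then a Gaussian draw for x_m when q_m = 1.
    """
    h = H[:, m]
    # residual with the m-th contribution removed
    e = z - H @ x + h * x[m]
    # posterior Gaussian parameters for x_m when q_m = 1
    s2 = 1.0 / (h @ h / sigma_e**2 + 1.0 / sigma_x**2)
    mu = s2 * (h @ e) / sigma_e**2
    # posterior log-odds of q_m = 1 versus q_m = 0
    log_odds = (np.log(lam / (1.0 - lam))
                + 0.5 * np.log(s2 / sigma_x**2)
                + 0.5 * mu**2 / s2)
    p1 = 1.0 / (1.0 + np.exp(-log_odds))
    q[m] = int(rng.random() < p1)
    x[m] = rng.normal(mu, np.sqrt(s2)) if q[m] else 0.0
    return q, x
```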

The recent contribution of [7] already identified a convergence issue linked to time-shift ambiguities, and proposed an efficient way to solve it. In addition, scale ambiguities are treated in [8], where a scale re-sampling step is proposed to accelerate the convergence rate of the Markov chain. In this study, we point out another source of inefficiency, unrelated to the above-mentioned ambiguities: instead of exploring the $2^M$ configurations of q at an acceptable speed, the Gibbs sampler tends to get stuck for many iterations around particular configurations of q, often corresponding to local modes of the posterior distribution, as illustrated by the example in Section 2.2. This conclusion agrees with Bourguignon and Carfantan's analysis [3]: the Markov chain equilibrates rapidly around a mode (i.e., a locally optimal configuration), but takes a long time to move from mode to mode.

In order to make up for this deficiency, our first proposition is to adopt a grouped Gibbs sampler [5], called a K-tuple Gibbs sampler, where blocks of K adjacent BG variables $(q_i, \ldots, q_{i+K-1})$ and their associated amplitudes $(x_i, \ldots, x_{i+K-1})$ are jointly sampled. A comparable idea first appeared in [9], and more recently in [3], in the form of a deterministic iteration aimed at increasing the posterior probability. We then propose a second solution based on sampling the posterior distribution marginally to the amplitudes x [10]. A comparable idea is found in statistical signal segmentation [11], where some hyper-parameters are partially marginalized. In fact, as Liu pointed out in [5], completely integrating out some components (the Gaussian amplitudes x in our case) leads to a more efficient sampling scheme called collapsed Gibbs sampling. However, a plain collapsed Gibbs sampler on the marginal posterior distribution involves hardly tractable sampling steps. In particular, it is anything but simple to sample h conditionally on (z,q) and marginally with respect to x. Our scheme solves this problem by combining a step that samples q marginally with respect to x with other sampling steps involving x. Such a partially marginalized sampler is fully valid from the mathematical point of view. In this paper, we show that it can be interpreted as a plain Gibbs sampler with a particular scanning order of the variables.
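One K-tuple update can be sketched as follows: all 2^K spike patterns of a block are enumerated, each pattern is weighted with the block amplitudes integrated out (standard Gaussian marginalization), one pattern is drawn, and its amplitudes are then resampled. This is a generic reading of the grouped-sampling idea, not the paper's optimized implementation; the interface and names are illustrative.

```python
import itertools
import numpy as np

def ktuple_update(block, z, H, x, q, lam, sigma_x, sigma_e, rng):
    """Jointly resample (q_m, x_m) for the K adjacent sites in `block`
    by enumerating all 2^K spike configurations.

    H is the N x M observation matrix. Block amplitudes are integrated
    out when weighting configurations, then redrawn for the chosen one.
    """
    K = len(block)
    block = list(block)
    e = z - H @ x + H[:, block] @ x[block]   # data with the block removed
    log_w, params = [], []
    for c in itertools.product([0, 1], repeat=K):
        active = [m for m, cm in zip(block, c) if cm]
        lp = sum(c) * np.log(lam) + (K - sum(c)) * np.log(1.0 - lam)
        if active:
            A = H[:, active]
            # posterior covariance/mean of the active amplitudes
            S = np.linalg.inv(A.T @ A / sigma_e**2
                              + np.eye(len(active)) / sigma_x**2)
            b = A.T @ e / sigma_e**2
            mu = S @ b
            # Gaussian marginal likelihood of this configuration
            lp += (0.5 * np.linalg.slogdet(S)[1]
                   - len(active) * np.log(sigma_x)
                   + 0.5 * mu @ b)
            params.append((active, mu, S))
        else:
            params.append(([], None, None))
        log_w.append(lp)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    idx = rng.choice(len(w), p=w / w.sum())
    active, mu, S = params[idx]
    q[block] = 0
    x[block] = 0.0
    if active:
        q[active] = 1
        x[active] = rng.multivariate_normal(mu, S)
    return q, x
```

Since the weights are computed marginally to the block amplitudes, a move such as shifting a spike to a neighboring position inside the block can be accepted in one step, which is precisely what the site-by-site scheme struggles with.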

Simulation tests on toy examples and on the so-called Mendel's sequence [2] confirm the efficiency of both methods in terms of computation time before convergence of the Markov chain. Further analysis shows that the data length provides a criterion to choose between the two proposed methods.

This paper is organized as follows. In Section 2, after a brief formulation of the blind BG deconvolution problem, the Gibbs sampler of the joint posterior distribution [1] is presented, and an example illustrates its inefficiency as regards the sampling of q. Sections 3 and 4 respectively introduce the generalized K-tuple Gibbs sampler and the partially marginalized sampler. In both cases, a toy example is used to evaluate the capability of the sampler to escape from locally optimal configurations, and implementation issues are carefully dealt with.

Finally, simulation results are presented in Section 5 to compare the efficiency of the sampling schemes according to Brooks and Gelman's convergence diagnostic [12], and conclusions are drawn in Section 6.

Section snippets

Statistical model

The mathematical model of convolution reads $z_n = \sum_{k=0}^{P} h_k x_{n-k} + \varepsilon_n$ for all $n \in \{1, \ldots, N\}$, where $z = [z_1, \ldots, z_N]^t$ denotes the observed vector, $x = [x_1, \ldots, x_M]^t$ is the unknown spike train, $h = [h_0, \ldots, h_P]^t$ the impulse response (IR) of the system (assumed finite here) and $\varepsilon = [\varepsilon_1, \ldots, \varepsilon_N]^t$ the noise vector, often assumed white, stationary and Gaussian. The deconvolution problem is said to be blind when h is unknown, which is the case studied here. Akin to [1], the following assumptions are made:

  • $\varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2 I)$ is independent of x and h;

  • x
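With a finite IR, the convolution model above can be written in matrix form as z = Hx + ε, where H is a Toeplitz matrix built from h. A minimal sketch, assuming the full-convolution convention N = M + P (function name illustrative):

```python
import numpy as np

def make_conv_matrix(h, M):
    """Build the N x M Toeplitz matrix H such that H @ x equals the full
    convolution h * x, with N = M + P and P + 1 the length of h."""
    P = len(h) - 1
    H = np.zeros((M + P, M))
    for m in range(M):
        # column m is the impulse response shifted down by m samples
        H[m:m + len(h), m] = h
    return H

h = np.array([1.0, -0.5, 0.25])
H = make_conv_matrix(h, M=5)
```

With this construction, `H @ x` matches NumPy's full convolution `np.convolve(h, x)`.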

Generalized Gibbs on K-tuple variables

The inefficiency of basic Gibbs sampling in the context of sparse spike estimation has already been noticed by Bourguignon and Carfantan [3]. They proposed a solution involving shifts of detected spikes to adjacent positions. However, their solution is a deterministic procedure aimed at increasing the posterior probability during the burn-in period of the Markov chain. Therefore, it does not leave the posterior distribution invariant. In contrast, our goal is to propose a valid random sampling

Partially marginalized Gibbs sampler

Another sampling scheme is now proposed and studied, that indirectly tackles the inefficiency of the hybrid Gibbs sampling by marginalizing out x in the step that samples q. The resulting Markov chain should move more freely from one configuration to another. For instance, in the example of Fig. 1, q* can be reached from q(0) in three iterations that consist in first inserting a new spike at position 10, and then removing the other two, one after the other. Intuitively, none of these iterations

Convergence diagnostic

In order to compare empirically the convergence speed of the different samplers, we have resorted to Brooks and Gelman's iterated graphical method to assess convergence [12]. This diagnostic method is based upon the covariance estimation of m independent Markov chains $\{\Phi_j^t,\ j = 1, \ldots, m;\ t = 1, \ldots, n\}$ of equal length n. Let $\bar{\Phi}_{j\cdot}$ (respectively, $\bar{\Phi}_{\cdot\cdot}$) denote the local (respectively, global) mean of the chains. The intra-chain and inter-chain variances are defined as covariance matrix averages: $V_{\mathrm{intra}} = \frac{1}{m(n-1)} \sum_{j=1}^{m} \sum_{t=1}^{n} (\Phi_j^t - \bar{\Phi}_{j\cdot})(\Phi_j^t - \bar{\Phi}_{j\cdot})^t$
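As a simplified, univariate companion to this diagnostic, the scalar potential scale reduction factor combines the same intra-chain and inter-chain variances; values close to 1 suggest approximate convergence. This sketch follows the standard Gelman-Rubin construction rather than the multivariate version used in the paper:

```python
import numpy as np

def psrf(chains):
    """Scalar potential scale reduction factor for m chains of length n
    (rows are chains). Values near 1 indicate approximate convergence."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # intra-chain variance
    B = n * chain_means.var(ddof=1)         # inter-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_hat / W)
```

Chains drawn from the same distribution give a factor near 1, while chains stuck around different modes (as described in the introduction) keep it well above 1.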

Conclusion

This paper proposes two distinct methods for dealing with the identified inefficiency concerning the sampling of the Bernoulli labels q in the BG deconvolution problem. Detailed algorithms are given for both methods, as well as their performance on simulation tests. It is shown that both methods are mathematically valid MCMC sampling schemes and that both achieve better convergence properties in comparison with the hybrid sampler, the latter being based on Cheng et al.'s Gibbs scheme with time-shift compensations

References (17)

  • O. Cappé et al.

    Simulation-based methods for blind maximum-likelihood filter identification

    Signal Processing

    (1999)
  • Q. Cheng et al.

    Simultaneous wavelet estimation and deconvolution of reflection seismic signals

    IEEE Trans. Geosci. Remote Sensing

    (1996)
  • J.J. Kormylo et al.

    Maximum-likelihood seismic deconvolution

    IEEE Trans. Geosci. Remote Sensing

    (1983)
  • S. Bourguignon, H. Carfantan, Bernoulli–Gaussian spectral analysis of unevenly spaced astrophysical data, in: IEEE...
  • F. Champagnat et al.

    Unsupervised deconvolution of sparse spike trains using stochastic approximation

    IEEE Trans. Signal Process.

    (1996)
  • J.S. Liu

    Monte Carlo Strategies in Scientific Computing

  • C.P. Robert et al.

    Monte Carlo Statistical Methods

  • C. Labat, J. Idier, Sparse blind deconvolution accounting for time-shift ambiguity, in: Proceedings of IEEE ICASSP,...
There are more references available in the full text version of this article.

1

The Matlab source code of the algorithms in this article is available upon request.
