Signal Processing

Volume 91, Issue 4, April 2011, Pages 759-772

Enhanced sampling schemes for MCMC based blind Bernoulli–Gaussian deconvolution

https://doi.org/10.1016/j.sigpro.2010.08.009

Abstract

This paper proposes and compares two new sampling schemes for sparse deconvolution using a Bernoulli–Gaussian model. To tackle such a deconvolution problem in a blind and unsupervised context, the Markov Chain Monte Carlo (MCMC) framework is usually adopted, and the chosen sampling scheme is most often the Gibbs sampler. However, such a sampling scheme fails to explore the state space efficiently. Our first alternative, the K-tuple Gibbs sampler, is simply a grouped Gibbs sampler. The second one, called partially marginalized sampler, is obtained by integrating the Gaussian amplitudes out of the target distribution. While the mathematical validity of the first scheme is obvious as a particular instance of the Gibbs sampler, a more detailed analysis is provided to prove the validity of the second scheme.

For both methods, optimized implementations are proposed in terms of computation and storage cost. Finally, simulation results validate both schemes as more efficient in terms of convergence time compared with the plain Gibbs sampler. Benchmark sequence simulations show that the partially marginalized sampler takes fewer iterations to converge than the K-tuple Gibbs sampler. However, its computation load per iteration grows almost quadratically with respect to the data length, while it only grows linearly for the K-tuple Gibbs sampler.

Introduction

The problem of restoring a sparse spike train x distorted by a linear system h and corrupted by noise ε, such that z = x*h + ε, arises in many fields such as seismic exploration [1], [2] and astronomy [3].

In this paper, we adopt a Bernoulli–Gaussian (BG) model for the spike train x, following [2] and many subsequent contributions such as [1], [4]. A BG signal is an independent, identically distributed (iid) process defined in two stages. Firstly, the sparse nature of the spikes is governed by the Bernoulli law $P(q) = \lambda^L (1-\lambda)^{M-L}$, with the Bernoulli sequence $q = [q_1, \ldots, q_M]^t$ (a binary sequence of length M) and $L = \sum_{m=1}^{M} q_m$ the number of non-zero entries of q. Secondly, the amplitudes $x = [x_1, \ldots, x_M]^t$ are assumed iid zero-mean Gaussian conditionally on q: $x\,|\,q \sim \mathcal{N}(0, \sigma_x^2\,\mathrm{diag}(q))$, where $\mathrm{diag}(q)$ denotes the diagonal matrix whose diagonal is q.
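As a minimal illustration of this two-stage model, the following sketch draws one BG sequence (function name and parameter values are illustrative, not taken from the paper):

```python
import numpy as np

def sample_bg(M, lam, sigma_x, rng=None):
    """Draw one Bernoulli-Gaussian sequence of length M.

    lam is the Bernoulli parameter (spike probability) and sigma_x the
    standard deviation of the Gaussian amplitudes.
    """
    rng = np.random.default_rng(rng)
    q = (rng.random(M) < lam).astype(int)      # Bernoulli labels
    x = q * rng.normal(0.0, sigma_x, size=M)   # amplitudes, zero where q = 0
    return q, x

q, x = sample_bg(M=100, lam=0.1, sigma_x=2.0, rng=0)
```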

The MCMC approach [5], [6] is a powerful numerical tool, appropriate to solve complex inference problems such as blind deconvolution. In the field of blind BG deconvolution, Cheng et al. pioneered the introduction of MCMC methods [1]. They proposed to rely on a plain Gibbs sampler, i.e., with a site-by-site updating scheme for the spike train, for which their algorithm constitutes a simple and canonical example. However, simulation results indicate that it lacks reliability: starting from different initial conditions, significantly different estimates are obtained, even after a considerable number of iterations.
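For concreteness, one site-by-site step of such a sampler can be sketched as follows: given all other sites, (q_m, x_m) is redrawn from its exact conditional, a Bernoulli draw followed by a Gaussian draw. This is a generic textbook version under the BG model, not the authors' exact implementation; H denotes the convolution matrix and all names are illustrative.

```python
import numpy as np

def gibbs_site_update(m, z, H, x, q, lam, sigma_x, sigma_e, rng):
    """One site-by-site Gibbs update of (q_m, x_m) given all other sites.

    H is the N x M observation (convolution) matrix. A standard
    derivation under the BG model gives a Bernoulli draw for q_m,
    then a Gaussian draw for x_m when q_m = 1.
    """
    h = H[:, m]
    # residual with the m-th contribution removed
    e = z - H @ x + h * x[m]
    # posterior Gaussian parameters for x_m when q_m = 1
    s2 = 1.0 / (h @ h / sigma_e**2 + 1.0 / sigma_x**2)
    mu = s2 * (h @ e) / sigma_e**2
    # posterior log-odds of q_m = 1 versus q_m = 0
    log_odds = (np.log(lam / (1.0 - lam))
                + 0.5 * np.log(s2 / sigma_x**2)
                + 0.5 * mu**2 / s2)
    p1 = 1.0 / (1.0 + np.exp(-log_odds))
    q[m] = int(rng.random() < p1)
    x[m] = rng.normal(mu, np.sqrt(s2)) if q[m] else 0.0
    return q, x
```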

The recent contribution of [7] already identified a convergence issue linked to time-shift ambiguities, and proposed an efficient way to solve it. In addition, scale ambiguities are treated in [8], where a scale re-sampling step is proposed to accelerate the convergence rate of the Markov chain. In this study, we point out another source of inefficiency, unrelated to the above-mentioned ambiguities: instead of exploring the $2^M$ configurations of q at an acceptable speed, the Gibbs sampler tends to get stuck for many iterations around particular configurations of q, often corresponding to local modes of the posterior distribution, as illustrated by the example in Section 2.2. This conclusion agrees with Bourguignon and Carfantan's analysis [3]: the Markov chain equilibrates rapidly around a mode (i.e., a locally optimal configuration), but takes a long time to move from mode to mode.

In order to make up for this deficiency, our first proposition is to adopt a grouped Gibbs sampler [5], called a K-tuple Gibbs sampler, where blocks of K adjacent BG variables $(q_i, \ldots, q_{i+K-1})$ and their associated amplitudes $(x_i, \ldots, x_{i+K-1})$ are jointly sampled. A comparable idea first appeared in [9], and more recently in [3], in the form of a deterministic iteration aimed at increasing the posterior probability. We then propose a second solution based on sampling the posterior distribution marginally to the amplitudes x [10]. A comparable idea is found in statistical signal segmentation [11], where some hyper-parameters are partially marginalized. In fact, as Liu pointed out in [5], completely integrating out some components (the Gaussian amplitudes x in our case) leads to a more efficient sampling scheme called collapsed Gibbs sampling. However, a plain collapsed Gibbs sampler on the marginal posterior distribution involves hardly tractable sampling steps. In particular, it is anything but simple to sample h conditionally on (z,q) and marginally with respect to x. Our scheme solves this problem by combining a step that samples q marginally with respect to x with other sampling steps involving x. Such a partially marginalized sampler is fully valid from the mathematical point of view. In this paper, we show that it can be interpreted as a plain Gibbs sampler with a particular scanning order of the variables.
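One K-tuple update can be sketched as follows: all 2^K spike patterns of a block are enumerated, each pattern is weighted with the block amplitudes integrated out (standard Gaussian marginalization), one pattern is drawn, and its amplitudes are then resampled. This is a generic reading of the grouped-sampling idea, not the paper's optimized implementation; the interface and names are illustrative.

```python
import itertools
import numpy as np

def ktuple_update(block, z, H, x, q, lam, sigma_x, sigma_e, rng):
    """Jointly resample (q_m, x_m) for the K adjacent sites in `block`
    by enumerating all 2^K spike configurations.

    H is the N x M observation matrix. Block amplitudes are integrated
    out when weighting configurations, then redrawn for the chosen one.
    """
    K = len(block)
    block = list(block)
    e = z - H @ x + H[:, block] @ x[block]   # data with the block removed
    log_w, params = [], []
    for c in itertools.product([0, 1], repeat=K):
        active = [m for m, cm in zip(block, c) if cm]
        lp = sum(c) * np.log(lam) + (K - sum(c)) * np.log(1.0 - lam)
        if active:
            A = H[:, active]
            # posterior covariance/mean of the active amplitudes
            S = np.linalg.inv(A.T @ A / sigma_e**2
                              + np.eye(len(active)) / sigma_x**2)
            b = A.T @ e / sigma_e**2
            mu = S @ b
            # Gaussian marginal likelihood of this configuration
            lp += (0.5 * np.linalg.slogdet(S)[1]
                   - len(active) * np.log(sigma_x)
                   + 0.5 * mu @ b)
            params.append((active, mu, S))
        else:
            params.append(([], None, None))
        log_w.append(lp)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    idx = rng.choice(len(w), p=w / w.sum())
    active, mu, S = params[idx]
    q[block] = 0
    x[block] = 0.0
    if active:
        q[active] = 1
        x[active] = rng.multivariate_normal(mu, S)
    return q, x
```

Since the weights are computed marginally to the block amplitudes, a move such as shifting a spike to a neighboring position inside the block can be accepted in one step, which is precisely what the site-by-site scheme struggles with.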

Simulation tests on toy examples and on the so-called Mendel's sequence [2] confirm the efficiency of both methods in terms of computation time before convergence of the Markov chain. Further analysis shows that the data length provides a criterion to choose between the two proposed methods.

This paper is organized as follows. In Section 2, after a brief formulation of the blind BG deconvolution problem, the Gibbs sampler of the joint posterior distribution [1] is presented, and an example illustrates its inefficiency as regards the sampling of q. Sections 3 and 4 respectively introduce the generalized K-tuple Gibbs sampler and the partially marginalized sampler. In both cases, a toy example is used to evaluate the capability of the sampler to escape from locally optimal configurations, and implementation issues are carefully dealt with.

Finally, simulation results are presented in Section 5 to compare the efficiency of the sampling schemes according to Brooks and Gelman's convergence diagnostic [12], and conclusions are drawn in Section 6.

Section snippets

Statistical model

The mathematical model of convolution reads $z_n = \sum_{k=0}^{P} h_k x_{n-k} + \varepsilon_n$ for all $n \in \{1, \ldots, N\}$, where $z = [z_1, \ldots, z_N]^t$ denotes the observed vector, $x = [x_1, \ldots, x_M]^t$ is the unknown spike train, $h = [h_0, \ldots, h_P]^t$ the impulse response (IR) of the system (assumed finite here) and $\varepsilon = [\varepsilon_1, \ldots, \varepsilon_N]^t$ the noise vector, often assumed white, stationary and Gaussian. The deconvolution problem is said to be blind when h is unknown, which is the case studied here. Akin to [1], the following assumptions are made:

  • $\varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2 I)$ is independent of x and h;

  • x
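With a finite IR, the convolution model above can be written in matrix form as z = Hx + ε, where H is a Toeplitz matrix built from h. A minimal sketch, assuming the full-convolution convention N = M + P (function name illustrative):

```python
import numpy as np

def make_conv_matrix(h, M):
    """Build the N x M Toeplitz matrix H such that H @ x equals the full
    convolution h * x, with N = M + P and P + 1 the length of h."""
    P = len(h) - 1
    H = np.zeros((M + P, M))
    for m in range(M):
        # column m is the impulse response shifted down by m samples
        H[m:m + len(h), m] = h
    return H

h = np.array([1.0, -0.5, 0.25])
H = make_conv_matrix(h, M=5)
```

With this construction, `H @ x` matches NumPy's full convolution `np.convolve(h, x)`.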

Generalized Gibbs on K-tuple variables

The inefficiency of basic Gibbs sampling in the context of sparse spike estimation has already been noticed by Bourguignon and Carfantan [3]. They proposed a solution involving shifts of detected spikes to adjacent positions. However, their solution is a deterministic procedure aimed at increasing the posterior probability during the burn-in period of the Markov chain. Therefore, it does not leave the posterior distribution invariant. In contrast, our goal is to propose a valid random sampling

Partially marginalized Gibbs sampler

Another sampling scheme is now proposed and studied, that indirectly tackles the inefficiency of the hybrid Gibbs sampling by marginalizing out x in the step that samples q. The resulting Markov chain should move more freely from one configuration to another. For instance, in the example of Fig. 1, q* can be reached from q(0) in three iterations that consist in first inserting a new spike at position 10, and then removing the other two, one after the other. Intuitively, none of these iterations

Convergence diagnostic

In order to compare empirically the convergence speed of the different samplers, we have resorted to Brooks and Gelman's iterated graphical method to assess convergence [12]. This diagnostic method is based upon the covariance estimation of m independent Markov chains $\{\Phi_j^t,\ j = 1, \ldots, m;\ t = 1, \ldots, n\}$ of equal length n. Let $\bar{\Phi}_{j\cdot}$ (respectively, $\bar{\Phi}_{\cdot\cdot}$) denote the local (respectively, global) mean of the chains. The intra-chain and inter-chain variances are defined as covariance matrix averages: $V_{\mathrm{intra}} = \frac{1}{m(n-1)} \sum_{j=1}^{m} \sum_{t=1}^{n} (\Phi_j^t - \bar{\Phi}_{j\cdot})(\Phi_j^t - \bar{\Phi}_{j\cdot})^t$
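As a simplified, univariate companion to this diagnostic, the scalar potential scale reduction factor combines the same intra-chain and inter-chain variances; values close to 1 suggest approximate convergence. This sketch follows the standard Gelman-Rubin construction rather than the multivariate version used in the paper:

```python
import numpy as np

def psrf(chains):
    """Scalar potential scale reduction factor for m chains of length n
    (rows are chains). Values near 1 indicate approximate convergence."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # intra-chain variance
    B = n * chain_means.var(ddof=1)         # inter-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_hat / W)
```

Chains drawn from the same distribution give a factor near 1, while chains stuck around different modes (as described in the introduction) keep it well above 1.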

Conclusion

This paper proposes two distinct methods for dealing with the identified inefficiency concerning the sampling of the Bernoulli labels q in the BG deconvolution problem. Detailed algorithms are given for both methods, as well as their performance on simulation tests. It is shown that both methods are mathematically valid MCMC sampling schemes and that both achieve better convergence properties in comparison with the hybrid sampler, the latter being based on Cheng et al.'s Gibbs scheme with time-shift compensations

References (17)

  • O. Cappé et al.

    Simulation-based methods for blind maximum-likelihood filter identification

    Signal Processing

    (1999)
  • Q. Cheng et al.

    Simultaneous wavelet estimation and deconvolution of reflection seismic signals

    IEEE Trans. Geosci. Remote Sensing

    (1996)
  • J.J. Kormylo et al.

    Maximum-likelihood seismic deconvolution

    IEEE Trans. Geosci. Remote Sensing

    (1983)
  • S. Bourguignon, H. Carfantan, Bernoulli–Gaussian spectral analysis of unevenly spaced astrophysical data, in: IEEE...
  • F. Champagnat et al.

    Unsupervised deconvolution of sparse spike trains using stochastic approximation

    IEEE Trans. Signal Process.

    (1996)
  • J.S. Liu

    Monte Carlo Strategies in Scientific Computing

  • C.P. Robert et al.

    Monte Carlo Statistical Methods

  • C. Labat, J. Idier, Sparse blind deconvolution accounting for time-shift ambiguity, in: Proceedings of IEEE ICASSP,...
There are more references available in the full text version of this article.

1

The Matlab source code of the algorithms in this article is available upon request.
