Signal Processing

Volume 91, Issue 4, April 2011, Pages 801-820

Bayesian learning of finite generalized Gaussian mixture models on images

https://doi.org/10.1016/j.sigpro.2010.08.014

Abstract

This paper presents a fully Bayesian approach to analyzing finite generalized Gaussian mixture models, which incorporate several standard mixtures widely used in signal and image processing applications, such as the Laplace and the Gaussian. Our work is motivated by the fact that the generalized Gaussian distribution (GGD) can be applied to a wide range of data thanks to its shape flexibility, which justifies its usefulness for modeling the statistical behavior of multimedia signals [1]. We present a method to evaluate the posterior distribution and Bayes estimators using a Gibbs sampling algorithm. For the selection of the number of components in the mixture, we use the integrated likelihood and the Bayesian information criterion (BIC). We validate the proposed method by applying it to synthetic data, real datasets, texture classification and retrieval, and image segmentation, while comparing it with several other approaches.

Introduction

Finite mixtures are a flexible and powerful probabilistic tool for modeling data [2]. Mixture models are very useful in areas where statistical modeling of data is needed, such as signal and image processing, pattern recognition, bioinformatics, computer vision, and machine learning. The three main problems in mixture modeling are the choice of the probability density function (pdf), parameter estimation, and the selection of the number of clusters. In most applications, the Gaussian density is used in the mixture modeling of data. However, many signal processing systems often operate in environments characterized by non-Gaussian and highly peaked sources (subband image and speech coefficients, for instance) [1], [3], [4], [5]. An interesting approach involving very general parametric models (i.e., statistical distributions), based on Pearson's system, has been proposed in [6] to model non-Gaussian data. Moreover, many studies have shown that the GGD can be a good alternative to the Gaussian thanks to its shape flexibility, which allows the modeling of a large number of non-Gaussian signals [7], [8], [9], [10]. The GGD contains the Laplacian, the Gaussian, and asymptotically the uniform distribution as special cases [11] and has been used, for instance, in [12], [4] to fit subband histograms, in [13] for multiresolution transmission of high-definition video, in [14] for subband decomposition of video, in [15] for buffer control, in [16], [17], [18] for texture classification and retrieval, in [19] for denoising applications, in [20], [21] for data and image compression, in [22] for edge modeling, in [23], [24] for image thresholding, in [25], [26] for speech modeling, in [27], [28], [29] for video and image segmentation, in [30] for SAR image statistics modeling, and in [31] for multichannel audio resynthesis.

Several approaches have been considered in the past to estimate the GGD's parameters, such as moment estimation [32], [14], [33], entropy matching estimation [34], [26], and maximum likelihood estimation [32], [16], [35], [36], [10]. It is noteworthy that these approaches consider a single distribution. Concerning finite mixture model parameter estimation, approaches can be classified into two categories: deterministic and Bayesian methods. In deterministic approaches, parameters are taken as fixed and unknown, and inference is based on the likelihood of the data. Some deterministic approaches have been proposed in the past for the estimation of finite generalized Gaussian mixture (GGM) model parameters (see, for instance, [28], [29]). Although deterministic approaches, such as the expectation–maximization (EM) algorithm [37], have dominated mixture model estimation thanks to their low computational cost, many works have shown that these methods suffer from severe problems, such as convergence to local maxima and a tendency to overly complicate the resulting models (i.e., overfitting) [38], especially when data are sparse or noisy. Several stochastic versions of the EM algorithm have been introduced to overcome these problems; examples include the stochastic EM (SEM) [39], the stochastic approximation EM (SAEM) [40], the iterated conditional expectation (ICE) [41], and the Monte Carlo EM (MCEM) [42]. With the evolution of computational tools, signal and image processing researchers have also been encouraged to develop and use pure Bayesian Markov chain Monte Carlo (MCMC) methods as an alternative. In Bayesian methods, parameters are considered random and follow prior probability distributions; these priors describe our knowledge before observing the data, and the likelihood is then used to update them. For interesting and in-depth discussions of general Bayesian theory, see [38], [43].
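To make the moment approach concrete, the following is a minimal Python/SciPy sketch of the classical moment-matching idea for a single GGD (an illustration, not this paper's estimator). It relies on the fact that the ratio E|x − μ|/σ = Γ(2/β)/√(Γ(1/β)Γ(3/β)) is monotone in the shape β, so it can be inverted numerically from the sample moments.

    import numpy as np
    from scipy.special import gamma
    from scipy.optimize import brentq

    def estimate_ggd_moments(x):
        """Moment-matching estimate of (mu, sigma, beta) for a single GGD:
        invert the monotone ratio E|x-mu|/sigma = Gamma(2/b)/sqrt(Gamma(1/b)Gamma(3/b))."""
        mu = x.mean()
        sigma = x.std()
        r = np.mean(np.abs(x - mu)) / sigma
        f = lambda b: gamma(2.0 / b) / np.sqrt(gamma(1.0 / b) * gamma(3.0 / b)) - r
        beta = brentq(f, 0.05, 20.0)  # bracket chosen heuristically
        return mu, sigma, beta

    # For Gaussian samples the recovered shape should be close to 2:
    x = np.random.default_rng(1).normal(size=20000)
    print(estimate_ggd_moments(x))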

To the best of our knowledge, the learning techniques that have been proposed for the GGM are deterministic and thus usually excessively sensitive to noise. We therefore propose in this paper a novel Bayesian approach that evaluates the posterior distribution of the GGM and learns its parameters using Gibbs sampling [44], with the integrated likelihood used to select the optimal number of components. To validate our learning algorithm, we compare it to four stochastic techniques, namely SEM, SAEM, MCEM, and ICE, using synthetic data, real datasets, and real-world applications involving texture classification and retrieval, and image segmentation.
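As a rough illustration of what one sweep of such a sampler involves, here is a schematic Python sketch for an M-component GGM: allocations are drawn from the posterior responsibilities, mixing weights from their Dirichlet full conditional (a symmetric Dirichlet(1) prior is assumed), and each component's GGD parameters are updated with a random-walk Metropolis-Hastings step, since their full conditionals are not of standard form. Priors, proposal scales, and function names here are illustrative assumptions, not the paper's exact specification.

    import numpy as np
    from scipy.special import gamma as Gamma

    rng = np.random.default_rng(0)

    def ggd_pdf(x, mu, sigma, beta):
        alpha = np.sqrt(Gamma(3.0 / beta) / Gamma(1.0 / beta)) / sigma
        return beta * alpha / (2.0 * Gamma(1.0 / beta)) * np.exp(-(alpha * np.abs(x - mu)) ** beta)

    def gibbs_sweep(x, weights, params):
        """One schematic Gibbs sweep; params is a list of (mu, sigma, beta)."""
        M = len(params)
        # 1. Sample allocations z_i from the posterior responsibilities.
        resp = np.column_stack([w * ggd_pdf(x, *p) for w, p in zip(weights, params)])
        resp /= resp.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(M, p=r) for r in resp])
        # 2. Sample mixing weights from their Dirichlet full conditional.
        weights = rng.dirichlet(1 + np.bincount(z, minlength=M))
        # 3. Random-walk Metropolis-Hastings update of each component's parameters
        #    (flat priors and fixed proposal scales assumed in this sketch).
        new_params = []
        for j, (mu, sigma, beta) in enumerate(params):
            xj = x[z == j]
            loglik = lambda p: np.sum(np.log(ggd_pdf(xj, *p) + 1e-300))
            prop = (mu + 0.1 * rng.normal(),
                    abs(sigma + 0.1 * rng.normal()) + 1e-3,
                    abs(beta + 0.1 * rng.normal()) + 1e-3)
            accept = xj.size and np.log(rng.uniform()) < loglik(prop) - loglik((mu, sigma, beta))
            new_params.append(prop if accept else (mu, sigma, beta))
        return weights, new_params, z

    # One sweep on toy data, starting from two hypothetical components:
    x = np.concatenate([rng.normal(-2, 1, 500), rng.laplace(2, 1, 500)])
    w, p, z = gibbs_sweep(x, np.array([0.5, 0.5]), [(-1.0, 1.0, 2.0), (1.0, 1.0, 1.0)])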

The rest of this paper is organized as follows. The next section describes the GGM Bayesian estimation algorithm. In Section 3, we assess the performance of the new model on different applications. Our last section is devoted to the conclusion and some perspectives.

Section snippets

Finite GGM model

If a random variable $x \in \mathbb{R}$ follows a GGD with parameters $\mu$, $\alpha$ and $\beta$, then the density function is given by [12], [14]

$$P(x\,|\,\mu,\alpha,\beta)=\frac{\beta\alpha}{2\Gamma(1/\beta)}\,e^{-(\alpha|x-\mu|)^{\beta}}$$

where $\alpha=\frac{1}{\sigma}\sqrt{\Gamma(3/\beta)/\Gamma(1/\beta)}$, $-\infty<\mu<\infty$, $\beta>0$, $\sigma>0$, $\alpha>0$, and $\Gamma(\cdot)$ is the Gamma function given by $\Gamma(x)=\int_{0}^{\infty}t^{x-1}e^{-t}\,dt$, $x>0$. Here $\mu$, $\alpha$, $\sigma$ and $\beta$ denote the distribution mean, the inverse scale parameter, the standard deviation, and the shape parameter, respectively. The parameter $\beta$ controls the shape of the pdf: the larger its value, the flatter the pdf; the smaller its value, the more peaked the pdf.
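For reference, a direct NumPy/SciPy implementation of the density above might look as follows (a minimal sketch, using the α(σ, β) relation given in the text):

    import numpy as np
    from scipy.special import gamma

    def ggd_pdf(x, mu, sigma, beta):
        """GGD density P(x | mu, alpha, beta), with the inverse scale alpha
        derived from the standard deviation sigma as in the text."""
        alpha = np.sqrt(gamma(3.0 / beta) / gamma(1.0 / beta)) / sigma
        return beta * alpha / (2.0 * gamma(1.0 / beta)) * np.exp(-(alpha * np.abs(x - mu)) ** beta)

    # beta = 1 gives the Laplacian, beta = 2 the Gaussian, and
    # beta -> infinity approaches the uniform distribution.
    x = np.linspace(-3.0, 3.0, 7)
    print(ggd_pdf(x, mu=0.0, sigma=1.0, beta=2.0))  # matches the N(0, 1) density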

Design of experiments

In this section, we apply our Bayesian GGM estimation algorithm to synthetic data, real datasets, and real applications involving texture classification and retrieval, and image segmentation. We validate our algorithm by comparing it to various stochastic versions of the EM algorithm, namely SEM, SAEM, MCEM, and ICE. In fact, choosing a relevant model consists of both choosing its form (GGM in our case) and the number of components M. We use two approaches in order to rate the ability of the tested models: the marginal likelihood and the BIC criterion (a minimal BIC sketch follows below).
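On the BIC side of the selection, a minimal sketch is given below; fit_ggm and its loglik attribute are hypothetical placeholders, and the parameter count assumes three free parameters (μ, σ, β) per component plus M − 1 free mixing weights:

    import numpy as np

    def bic(loglik, M, n):
        """BIC for an M-component GGM fitted to n points; nu = 4M - 1 free
        parameters (three per component plus M - 1 mixing weights).
        Lower is better in this convention."""
        nu = 4 * M - 1
        return -2.0 * loglik + nu * np.log(n)

    # Hypothetical usage: pick the M whose fitted mixture minimizes BIC.
    # fits = {M: fit_ggm(x, M) for M in range(1, 6)}          # fit_ggm assumed
    # best_M = min(fits, key=lambda M: bic(fits[M].loglik, M, len(x)))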

Conclusion

We have presented a Bayesian analysis of finite generalized Gaussian mixtures. Our learning algorithm is based on the Monte Carlo simulation technique of Gibbs sampling mixed with a Metropolis–Hastings step. For the estimation of the number of clusters describing the mixture model, we used the marginal likelihood with Laplace approximation, and the BIC criterion. We have demonstrated clearly through different applications that Bayesian estimation and selection give reliable estimates. The Bayesian…

Acknowledgment

The completion of this research was made possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC), a NATEQ Nouveaux Chercheurs Grant, and a start-up grant from Concordia University. The authors would like to thank the anonymous referees for their helpful comments.

References (60)

  • R. Laroia et al., A structured fixed-rate vector quantizer derived from a variable-length scalar quantizer: part I—memoryless sources, IEEE Transactions on Information Theory (1993)
  • Y. Delignon et al., Estimation of generalized mixtures and its application in image segmentation, IEEE Transactions on Image Processing (1997)
  • J.H. Miller et al., Detectors for discrete-time signals in non-Gaussian noise, IEEE Transactions on Information Theory (1972)
  • N. Farvardin et al., Optimum quantizer performance for a class of non-Gaussian memoryless sources, IEEE Transactions on Information Theory (1984)
  • Z. Gao et al., A comparison of the Z, E8, and Leech lattices for quantization of low-shape-parameter generalized Gaussian sources, IEEE Signal Processing Letters (1995)
  • S. Meignen et al., On the modeling of small sample distributions with generalized Gaussian density in a maximum likelihood framework, IEEE Transactions on Image Processing (2006)
  • W. Mauersberger, Experimental results on the performance of mismatched quantizers, IEEE Transactions on Information Theory (1979)
  • S.G. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence (1989)
  • T. Naveen et al., Motion compensated multiresolution transmission of high definition video, IEEE Transactions on Circuits and Systems for Video Technology (1994)
  • K. Sharifi et al., Estimation of shape parameter for generalized Gaussian distributions in subband decomposition of video, IEEE Transactions on Circuits and Systems for Video Technology (1995)
  • G. Calvagno et al., Modeling of subband image data for buffer control, IEEE Transactions on Circuits and Systems for Video Technology (1997)
  • M.N. Do et al., Wavelet-based texture retrieval using generalized Gaussian density and Kullback–Leibler distance, IEEE Transactions on Image Processing (2002)
  • J.-F. Aujol et al., Wavelet-based level set evolution for classification of textured images, IEEE Transactions on Image Processing (2003)
  • S.-K. Choi et al., Supervised texture classification using characteristic generalized Gaussian density, Journal of Mathematical Imaging and Vision (2007)
  • P. Moulin et al., Analysis of multiresolution image denoising schemes using generalized Gaussian and complexity priors, IEEE Transactions on Information Theory (1999)
  • T.R. Fischer, A pyramid vector quantizer, IEEE Transactions on Information Theory (1986)
  • K.A. Birney et al., On the modeling of DCT and subband image data for compression, IEEE Transactions on Image Processing (1995)
  • C. Bouman et al., A generalized Gaussian image model for edge-preserving MAP estimation, IEEE Transactions on Image Processing (1993)
  • S. Gazor et al., Speech probability distributions, IEEE Signal Processing Letters (2003)
  • M.S. Allili, N. Bouguila, D. Ziou, A robust video foreground segmentation by using generalized Gaussian mixture...