
Image and Vision Computing

Volume 21, Issue 1, 10 January 2003, Pages 29-36

Algorithms from statistical physics for generative models of images

https://doi.org/10.1016/S0262-8856(02)00134-8

Abstract

A general framework for defining generative models of images is Markov random fields (MRFs), with shift-invariant (homogeneous) MRFs being an important special case for modeling textures and generic images. Given a dataset of natural images and a set of filters from which filter histogram statistics are obtained, a shift-invariant MRF can be defined (as in [Neural Comput. 9 (1997) 1627]) as a distribution of images whose mean filter histogram values match the empirical values obtained from the data set. Certain parameters in the MRF model, called potentials, must be determined in order for the model to match the empirical statistics. Standard methods for calculating the potentials are computationally very demanding, such as Generalized Iterative Scaling (GIS), an iterative procedure that converges to the correct potential values. We define a fast approximation, called BKGIS, which uses the Bethe-Kikuchi approximation from statistical physics to speed up the GIS procedure. Results are demonstrated on a model using two filters, and we show synthetic images that have been sampled from the model. Finally, we show a connection between GIS and our previous work on the g-factor.

Introduction

It is increasingly important to learn generative models for vision from real data. A general framework for defining generative models of images is Markov random fields (MRFs), which may be used to define a probability distribution on an entire image pixel lattice. An MRF probability distribution is defined in terms of clique potential functions, referred to as ‘potentials’, which are functions of local clusters of pixels (‘cliques’) that enforce the desired statistical relationships among the pixel intensity values in these clusters. An important sub-class of MRFs is the class of shift-invariant (homogeneous) MRFs, i.e. those for which the potential functions are the same from clique to clique. Shift-invariant MRFs can be used for modeling statistically homogeneous patterns such as textures and generic images. Pioneering work by Zhu et al. [12] introduced the Minimax Entropy Learning (MEL) scheme, which enabled them to learn shift-invariant MRF distributions for images based on filter histograms obtained from a dataset of images. This work gave an elegant connection between generative models of images (e.g. [11], [12]) and empirical studies of the statistical properties of images; see, for example, [5], [6], [7].

Learning MRF distributions from empirical image data requires calculating the values of the potential functions that result in a distribution which is consistent with the empirical data. (This corresponds to the classic problem of estimating the parameters of a log-linear model.) Standard methods for calculating the potentials are computationally very demanding, such as Generalized iterative scaling (GIS), an iterative procedure due to Darroch and Ratcliff [3] that is guaranteed to converge to the correct potential values. GIS may be thought of as a form of steepest descent in which each iteration updates the potential values. For the type of MRF we consider in this paper, shift-invariant MRFs defined in terms of histograms of filter responses, each iteration of GIS requires calculating the mean filter histogram values, or histogram expectations, given the current value of the potentials. Calculating the histogram expectations is the computational bottleneck of GIS, since closed-form expressions for these expectations are intractable, and estimating expectations by methods such as Markov Chain Monte Carlo (MCMC) is very slow.
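The GIS iteration can be made concrete on a toy model. The sketch below runs GIS on a small discrete log-linear model — illustrative only, not the paper's image MRF; the feature matrix and target distribution are invented for the example. Because the states are enumerable here, the model expectations can be computed exactly; on an image lattice this expectation step is precisely the intractable bottleneck described above.

```python
import numpy as np

# Toy log-linear model P(x|lam) = exp(lam . phi(x)) / Z(lam) over 6 states,
# with hand-picked binary features (an illustrative stand-in for filter
# histogram counts).
phi = np.array([[1, 0, 1],
                [0, 1, 1],
                [1, 1, 0],
                [0, 0, 1],
                [1, 0, 0],
                [0, 1, 0]], dtype=float)
M = phi.sum(axis=1).max()                    # GIS requires sum_i phi_i(x) <= M
# Standard GIS trick: pad with a slack feature so features sum exactly to M.
phi = np.hstack([phi, (M - phi.sum(axis=1))[:, None]])

rng = np.random.default_rng(0)
target = rng.dirichlet(np.ones(6))           # stand-in "empirical" distribution
psi_obs = target @ phi                       # observed feature expectations

lam = np.zeros(phi.shape[1])
for _ in range(1000):
    logp = phi @ lam
    p = np.exp(logp - logp.max())
    p /= p.sum()                             # model distribution P(x | lam)
    psi_model = p @ phi                      # expectation step: the bottleneck
    lam += (1.0 / M) * (np.log(psi_obs) - np.log(psi_model))

# Recompute the model distribution with the final potentials.
logp = phi @ lam
p = np.exp(logp - logp.max())
p /= p.sum()
print(np.abs(p @ phi - psi_obs).max())       # residual is tiny after convergence
```

Each iteration nudges the potentials in log space by the mismatch between observed and model expectations; the expensive part is always the expectation `psi_model`, which GIS must re-evaluate under the current potentials.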

To speed up this bottleneck, we apply recent work on the Bethe-Kikuchi approximation [9], which is a standard approximation in statistical physics [4], to estimate the histogram expectations at each step of GIS. We name this algorithm BKGIS. Since the Bethe-Kikuchi approximation is a variational procedure that requires constrained optimization, we employ the recently devised CCCP algorithm [10] to perform the required constrained optimization. We apply the CCCP algorithm in a way that exploits the homogeneous structure of the MRF, resulting in an algorithm that converges quickly to a solution of the Bethe-Kikuchi approximation. We demonstrate our work by learning MRF models for generic images and generating corresponding image samples.

In addition, we show a direct relationship between the GIS algorithm and a previous method, called the multinomial approximation, proposed for estimating potentials using an independence assumption [2]. We show that this previous approach corresponds to the first iteration of the GIS algorithm with uniform initial conditions. This means that a single iteration of GIS can be sufficient to get a good approximation to the MRF potentials. The same relationship holds for BKGIS.

In Section 2, we briefly review MRFs and MEL. Section 3 introduces the GIS algorithm and demonstrates that the first iteration of GIS corresponds to the multinomial method. In Section 4 we describe the BKGIS algorithm, which uses the Bethe-Kikuchi approximation and a CCCP algorithm to speed up GIS. Section 5 gives results. Finally, in Appendix A and Appendix B we review the multinomial approximation method and show how its properties can be computed efficiently.


Shift-invariant Markov random fields

Suppose we have training image data which we assume has been generated by an (unknown) probability distribution P_T(x), where x represents an image. An MRF may be used to construct a generative model of P_T(x), and one procedure for defining an appropriate MRF is given by MEL [12]. The MEL procedure approximates P_T(x) by selecting the distribution with maximum entropy subject to the constraint that the feature statistics match their observed values, ⟨φ(x)⟩ = ψ_obs. This gives the exponential form P(x|λ) = exp[λ·φ(x)]/Z[λ], where
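The exponential form quoted here follows from a standard Lagrange-multiplier argument; a brief sketch (γ is an auxiliary multiplier for the normalization constraint):

```latex
% Maximize entropy subject to feature-matching and normalization:
\mathcal{L}[P] = -\sum_x P(x)\log P(x)
  + \lambda\cdot\Big(\sum_x P(x)\,\phi(x) - \psi_{\mathrm{obs}}\Big)
  + \gamma\Big(\sum_x P(x) - 1\Big).
% Setting \partial\mathcal{L}/\partial P(x) = 0:
-\log P(x) - 1 + \lambda\cdot\phi(x) + \gamma = 0
\;\;\Longrightarrow\;\;
P(x\,|\,\lambda) = \frac{\exp[\lambda\cdot\phi(x)]}{Z[\lambda]},
\qquad Z[\lambda] = \sum_x \exp[\lambda\cdot\phi(x)].
```

The multipliers λ play the role of the MRF potentials, and choosing them so that ⟨φ(x)⟩ = ψ_obs is exactly the estimation problem that GIS solves.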

Generalized iterative scaling

In this section we introduce GIS [3] and explain its connection to the multinomial approximation (described in Appendix A), which gives a rapid procedure for estimating potentials. GIS is an iterative procedure for calculating clique potentials that is guaranteed to converge to the maximum likelihood values of the potentials given the desired empirical filter marginals (e.g. filter histograms). We show that estimating the potentials by the multinomial

The BKGIS algorithm

We could iterate GIS to improve the estimate of the clique potentials beyond the accuracy of the multinomial approximation. Indeed, with a sufficient number of iterations we would be guaranteed to converge to the correct potentials. The main difficulty lies in estimating ψ_a^(t) for t>0. At t=0 this expectation is just the mean histogram with respect to the uniform distribution, α_a, which can be calculated efficiently as shown in Appendix B.

In this section, we define a new algorithm called BKGIS.

Results

The BKGIS algorithm, as defined above, is slightly unstable because the histogram expectations are estimated only approximately. To circumvent this problem, we modified the basic GIS update equation so that it makes more conservative updates and avoids instabilities: λ_a^{k,(t+1)} = λ_a^{k,(t)} + β(1/M)(log ψ_a^{k,obs} − log ψ_a^{k,(t)}), where β<1 is a coefficient that sets the scale of the update (β=1 corresponds to standard GIS). We chose β=0.2 as a compromise between stability and speed of convergence.
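The damped update is a one-line change to the GIS step; a minimal sketch (the numerical values below are invented for illustration):

```python
import numpy as np

# Damped GIS update in log space:
#   lam <- lam + beta * (1/M) * (log psi_obs - log psi_model),  beta < 1.
# beta = 1 recovers standard GIS; a smaller beta makes more conservative
# updates, which helps when psi_model is only approximate (e.g. estimated
# with the Bethe-Kikuchi approximation).

def damped_gis_step(lam, psi_obs, psi_model, M, beta=0.2):
    return lam + beta * (1.0 / M) * (np.log(psi_obs) - np.log(psi_model))

# Illustrative values, not from the paper:
lam = np.zeros(3)
psi_obs = np.array([0.5, 0.3, 0.2])     # observed histogram expectations
psi_model = np.array([0.4, 0.4, 0.2])   # (approximate) model expectations
lam = damped_gis_step(lam, psi_obs, psi_model, M=2.0)
print(lam)
```

The trade-off is the usual one between step size and stability: smaller β damps oscillations caused by noisy expectation estimates at the cost of more iterations.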

To test the

Discussion

This paper applied the GIS algorithm [3] to the problem of learning MRF potentials [12] from image histograms. We introduced a new algorithm called BKGIS which used a Bethe-Kikuchi approximation [4], [9] and a CCCP algorithm [10] to speed up a crucial stage of the GIS algorithm. In addition, we demonstrated that the first iteration of GIS, or BKGIS, corresponds to an approximate method for estimating potentials [2].

We note that our method can be generalized to apply to any set of filters (of

Acknowledgements

We would like to thank Jonathan Yedidia for helpful email correspondence. This work was supported by the National Institutes of Health (NEI) with grant number RO1-EY 12691-01.

References

  • J.M. Coughlan, A.L. Yuille, A phase space approach to minimax entropy learning and the minutemax approximation,...
  • J.M. Coughlan, A.L. Yuille, The g factor: relating distributions on features to distributions on images, Advances in...
  • J.N. Darroch et al., Generalized iterative scaling for log-linear models, The Annals of Mathematical Statistics (1972)
  • C. Domb et al. (1972)
  • S.M. Konishi, A.L. Yuille, J.M. Coughlan, S.C. Zhu, Fundamental bounds on edge detection: an information theoretic...
  • A.B. Lee et al., Occlusion models of natural images: a statistical study of a scale-invariant dead leaf model, International Journal of Computer Vision (2001)
