Distributed evolutionary Monte Carlo for Bayesian computing
Introduction
In recent decades, Markov chain Monte Carlo (MCMC) methods have served as indispensable tools for simulating from complex and intractable distributions in a variety of scientific fields. The fundamental algorithm, proposed by Metropolis et al. (1953) and generalized by Hastings (1970), applies a transition kernel to drive a Markov chain that explores the target function. However, the Metropolis–Hastings (MH) algorithm may perform poorly when the target function is high dimensional or has many modes separated by high barriers, in which case the chain can easily become trapped at a local mode. Many alternative algorithms have been proposed in the literature to overcome these problems, including parallel tempering (Geyer, 1991), exchange Monte Carlo (Hukushima and Nemoto, 1996), dynamic weighting (Wong and Liang, 1997), and others.
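To make the baseline concrete, the following is a minimal sketch of the random-walk MH algorithm described above; the Gaussian proposal, step size, and standard normal example target are illustrative choices, not from the paper.

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_iter=5000, step=1.0, rng=None):
    """Random-walk Metropolis-Hastings with a Gaussian proposal."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_iter, x.size))
    for t in range(n_iter):
        proposal = x + step * rng.standard_normal(x.size)
        # Accept with probability min(1, pi(proposal) / pi(x)).
        if np.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples[t] = x
    return samples

# Example: a one-dimensional standard normal target.
draws = metropolis_hastings(lambda x: -0.5 * np.sum(x**2), x0=[0.0])
```

With a single local proposal of this kind, the chain struggles precisely in the multimodal settings motivating the population-based methods below.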
Recently, investigators have ventured into population-based MCMC methods (Laskey and Myers, 2003), which evolve multiple Markov chains simultaneously and update each chain by borrowing information from the others. The adaptive direction sampling (ADS) algorithm (Gilks et al., 1994) is an early example of the population-based approach: it updates one member of the current population by line sampling along a direction toward another member. Liu et al. (2000) proposed a conjugate gradient Monte Carlo algorithm that modifies the ADS algorithm by forcing the sampling direction toward a local mode. To exchange information more efficiently, some modern optimization techniques have been introduced into the statistical literature. Liang and Wong (2000, 2001) proposed an Evolutionary Monte Carlo (EMC) algorithm based on the powerful genetic algorithm (Holland, 1975; Goldberg, 1989). The Differential-Evolution Monte Carlo (DE-MC) algorithm proposed by Ter Braak (2006) relies on its counterpart, the differential evolution algorithm (Storn and Price, 1997). In these two algorithms, a population of Markov chains is updated in parallel by evolution operators such as mutation and crossover. In the EMC algorithm, the Markov chains can also be placed on a temperature ladder to incorporate the features of parallel tempering.
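As an illustration of how such population updates borrow information across chains, here is a hedged sketch of one generation of Ter Braak's (2006) DE-MC move, in which each chain proposes a jump along the difference of two other randomly chosen chains; the default scale 2.38/sqrt(2d) follows Ter Braak's recommendation, but the function itself is a simplified illustration rather than the paper's algorithm.

```python
import numpy as np

def de_mc_step(log_target, pop, rng, gamma=None, eps=1e-4):
    """One generation of Differential-Evolution MC (Ter Braak, 2006).

    Chain i proposes x_i + gamma * (x_a - x_b) + small noise, where
    a, b are two other randomly chosen chains, and accepts the
    proposal by the usual Metropolis rule."""
    n, d = pop.shape
    gamma = gamma if gamma is not None else 2.38 / np.sqrt(2 * d)
    new = pop.copy()
    for i in range(n):
        a, b = rng.choice([j for j in range(n) if j != i],
                          size=2, replace=False)
        prop = new[i] + gamma * (new[a] - new[b]) + eps * rng.standard_normal(d)
        if np.log(rng.random()) < log_target(prop) - log_target(new[i]):
            new[i] = prop
    return new
```

Because the jump direction is learned from the current population, the proposal automatically adapts its scale and orientation to the target.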
In this paper, we propose a new evolutionary Monte Carlo algorithm for real-valued problems, called Distributed Evolutionary Monte Carlo (DGMC), which originates from the distributed genetic algorithm (Tanese, 1989). The distributed genetic algorithm (DGA) divides the whole population into a set of subpopulations and runs the genetic algorithm on each subpopulation. A major advantage of the DGA is that it allows information exchange between subpopulations through migration: periodically, some members are selected from one subpopulation and migrate to another. The DGA improves on the single-population genetic algorithm by preventing the premature convergence that often occurs in practice (Ryan, 1996). Our DGMC algorithm incorporates the DGA into the MCMC framework and drives the Markov chains through three genetic operators: mutation, crossover and migration. Table 1 compares the features of several popular population MCMC algorithms. We prove that the DGMC algorithm has the target function as its stationary distribution. The effectiveness of the DGMC algorithm is demonstrated by applications to two multimodal distributions and a real data example.
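The migration idea can be sketched as follows. This is a hypothetical, simplified version (swapping one randomly chosen member between two randomly chosen subpopulations); the actual DGMC migration operator and its acceptance rule are defined in Section 3. Because a symmetric swap is a deterministic involution, it leaves a product target over identically distributed chains invariant.

```python
import numpy as np

def migrate(subpops, rng):
    """Hypothetical migration sketch: swap one randomly chosen member
    between two randomly chosen subpopulations.  The move only
    relabels which subpopulation holds each chromosome, so a product
    target with identical marginals is preserved.  (The operator
    actually used by DGMC is specified in Section 3.)"""
    s = len(subpops)
    i, j = rng.choice(s, size=2, replace=False)
    a = rng.integers(len(subpops[i]))
    b = rng.integers(len(subpops[j]))
    subpops[i][a], subpops[j][b] = subpops[j][b].copy(), subpops[i][a].copy()
    return subpops
```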
The remainder of the paper is organized as follows. Section 2 reviews the distributed genetic algorithm. In Section 3, we describe the DGMC algorithm and prove its stationary properties. The proof is deferred to the Appendix. Section 4 illustrates the use of the DGMC algorithm through three numerical examples. Section 5 concludes the paper with a brief discussion.
Section snippets
Distributed genetic algorithm
The genetic algorithm emulates the natural evolution process and works as a powerful optimization technique by leading a population of potential solutions toward a global optimum. Suppose that f(x) is the target function, which is referred to as a fitness function. A potential solution x = (x_1, …, x_d) is called a chromosome, where x_i is called the gene at locus i. The value of x_i is called a genotype. To seek the optima, the genetic algorithm updates a population of chromosomes iteratively by
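The iteration just described can be sketched as a minimal real-coded genetic algorithm. The particular operators chosen here (tournament selection, one-point crossover, Gaussian mutation) and all parameter values are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

def genetic_algorithm(fitness, d, pop_size=40, n_gen=100, p_mut=0.1, rng=None):
    """Minimal real-coded GA: tournament selection, one-point
    crossover, and Gaussian mutation; returns the fittest chromosome."""
    rng = rng or np.random.default_rng(0)
    pop = rng.standard_normal((pop_size, d))
    for _ in range(n_gen):
        fit = np.array([fitness(x) for x in pop])
        new = []
        for _ in range(pop_size):
            # Tournament selection of two parents.
            i, j = rng.integers(pop_size, size=2)
            p1 = pop[i] if fit[i] >= fit[j] else pop[j]
            i, j = rng.integers(pop_size, size=2)
            p2 = pop[i] if fit[i] >= fit[j] else pop[j]
            # One-point crossover at a random locus.
            k = rng.integers(1, d) if d > 1 else 0
            child = np.concatenate([p1[:k], p2[k:]])
            # Gaussian mutation at each locus with probability p_mut.
            mask = rng.random(d) < p_mut
            child[mask] += 0.3 * rng.standard_normal(mask.sum())
            new.append(child)
        pop = np.array(new)
    return pop[np.argmax([fitness(x) for x in pop])]

# Maximize a concave toy fitness with optimum at (2, 2, 2).
best = genetic_algorithm(lambda x: -np.sum((x - 2.0) ** 2), d=3)
```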
Distributed evolutionary Monte Carlo
Suppose that we have a population of N Markov chains that are divided equally into s subpopulations of size m = N/s. Let x^(i) = (x^(i)_1, …, x^(i)_m) denote the current samples (chromosomes) of subpopulation i, where each x^(i)_j is a d-dimensional vector for j = 1, …, m. Denote by x = (x^(1), …, x^(s)) the samples of the whole population. In the following three sections, we define, respectively, three genetic operators (mutation, crossover and migration) for the DGMC algorithm. Furthermore, we show that the target distribution is invariant with
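A crossover move between two chains can be sketched as follows. This is a hedged illustration of a generic one-point crossover with a joint Metropolis acceptance, preserving the product target pi(x1) * pi(x2); the operators actually used by DGMC are specified later in this section. Since the crossover point is drawn uniformly and the map (x1, x2) -> (y1, y2) is an involution, the acceptance ratio reduces to the ratio of target values.

```python
import numpy as np

def crossover_step(log_target, x1, x2, rng):
    """One-point crossover between two chains, accepted jointly by a
    Metropolis rule so that pi(x1) * pi(x2) is preserved.  Assumes
    the chromosomes have dimension d >= 2."""
    d = x1.size
    k = rng.integers(1, d)  # crossover point
    y1 = np.concatenate([x1[:k], x2[k:]])
    y2 = np.concatenate([x2[:k], x1[k:]])
    log_ratio = (log_target(y1) + log_target(y2)
                 - log_target(x1) - log_target(x2))
    if np.log(rng.random()) < log_ratio:
        return y1, y2
    return x1, x2
```

Because crossover only recombines coordinates already present in the population, it lets chains jump between distant regions that a local mutation would rarely bridge.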
A bimodal example
We first consider an example of simulating from a mixture of two five-dimensional normal distributions, where the covariance matrix of each component is the 5×5 identity matrix. The example was also considered in Ter Braak (2006). The two modes of the target distribution are far apart, which poses a great challenge to the traditional MH algorithm.
To apply the DGMC algorithm, we used two subpopulations, each containing five chains. The migration rate was
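A target of this form can be written down as follows. The mode locations mu1 and mu2 below are placeholder values chosen for illustration (the snippet above omits the paper's exact means), but the structure, an equal-weight mixture of two five-dimensional normals with identity covariance, matches the description.

```python
import numpy as np

# Placeholder modes; the paper's exact mean vectors are not shown above.
mu1, mu2 = np.full(5, -5.0), np.full(5, 5.0)

def log_target(x):
    """Log-density (up to an additive constant) of the mixture
    0.5 * N(mu1, I5) + 0.5 * N(mu2, I5)."""
    a = -0.5 * np.sum((x - mu1) ** 2)
    b = -0.5 * np.sum((x - mu2) ** 2)
    return np.logaddexp(a, b)  # stable log(exp(a) + exp(b))
```

A single random-walk chain started near one mode will essentially never cross the low-probability valley between the components, which is exactly the behavior the population operators are designed to fix.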
Conclusion and discussion
The DGMC algorithm augments the Markov state to a population of chains instead of a single chain, and thus falls within the generic framework of population Markov chain Monte Carlo. The DGMC algorithm consists of three genetic operators: mutation, crossover and migration. The mutation operator is usually efficient at exploring local modes. The crossover operator helps the chains travel through regions of very low probability and is thus better suited to exploring the global space.
Acknowledgments
The authors thank the associate editor and the two reviewers for their helpful comments on improving the manuscript. The research of Kam-Wah Tsui was supported in part by NSF Grant DMS-0604931.
References (23)
- Geyer, C.J. (1991). Markov chain Monte Carlo maximum likelihood.
- Gilks, W.R., et al. (1994). Adaptive direction sampling. The Statistician.
- Goldberg, D.E., Segrest, P. (1987). Genetic algorithms with sharing for multimodal function optimization. In: Proc. 2nd...
- Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning.
- Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika.
- et al. (2004). Statistical Tools for Nonlinear Regression.
- Hukushima, K., Nemoto, K. (1996). Exchange Monte Carlo method and application to spin glass simulations. J. Phys. Soc. Japan.
- Holland, J.H. (1975). Adaptation in Natural and Artificial Systems.
- Laskey, K.B., Myers, J.W. (2003). Population Markov chain Monte Carlo. Machine Learning.
Cited by (15)
Bayesian variable selection in non-homogeneous hidden Markov models through an evolutionary Monte Carlo method
2020, Computational Statistics and Data Analysis
Citation Excerpt:
EMC can be included in the class of population MCMC (PMCMC), as defined by Blackmond-Laskey and Myers (2003), who compared the performance of random walk Metropolis, genetic, and PMCMC algorithms. Other significant contributions to EMC and PMCMC methods are as follows: EMC was generalized by Liang (2005) to sample from a distribution defined in variable dimensional spaces, with specific mutation, crossover, exchange operators designed for Bayesian neural networks; the genetic algorithm called Differential Evolution was combined with PMCMC by Ter Braak (2006); four moves for the EMC were developed by Goswami and Liu (2007), who also introduced a strategy for designing the temperatures; a review of PMCMC methods was presented by Jasra et al. (2007a), who subsequently proposed a trans-dimensional PMCMC (Jasra et al., 2007b); PMCMC methods were used in model choice by Friel and Pettitt (2008) and Calderhead and Girolami (2009) to compute the marginal likelihood via power posteriors; a stereo matching method using PMCMC was designed by Kim et al. (2009); a PMCMC called Distributed EMC was introduced by Hu and Tsui (2010); a sampling algorithm based upon EMC for variable selection in linear models when a large number of covariates are available was proposed by Bottolo and Richardson (2010); PMCMC algorithms to generate multiple history matched models in reservoir modelling were presented by Mohamed et al. (2012); and four gradient-free MCMC samplers (including PMCMC) applied to dynamical causal models in neuroimaging were compared by Sengupta et al. (2015). The plan of the paper is as follows.
Bayesian characterization of Young's modulus of viscoelastic materials in laminated structures
2013, Journal of Sound and Vibration
Parallel hierarchical sampling: A general-purpose interacting Markov chains Monte Carlo algorithm
2012, Computational Statistics and Data Analysis
Citation Excerpt:
Second, PHS is used to approximate marginal posterior inferences for the structure of a treed survival model. Despite the added complexity of these model uncertainty samplers, the PHS samplers presented can be implemented sequentially on single workstations as well as in distributed computing environments using standard message-passing software (Ren and Orkoulas, 2007; Hu and Tsui, 2010). Section 6 includes a critical discussion of PHS and of some research directions in the field of multiple chains MCMC.
Parametric identification of elastic modulus of polymeric material in laminated glasses
2012, IFAC Proceedings Volumes (IFAC-PapersOnline)
Evolutionary Markov chain Monte Carlo algorithms for optimal monitoring network designs
2012, Statistical Methodology
Citation Excerpt:
The algorithm is also equally applicable to other combinatorial optimization problems and in general to problems whose search space can be represented as a collection of binary sequences. The idea of simulating from the expected utility surface through an artificial distribution may also be coupled into other recently developed evolutionary Monte Carlo schemes such as [30,10] for optimization in discrete spaces, and [22,17,31,32] for optimization in continuous spaces. This is subject of future research.
A comprehensive Bayesian approach for model updating and quantification of modeling errors
2011, Probabilistic Engineering Mechanics
Citation Excerpt:
The population is updated by mutation (Metropolis update in a single chain), crossover (partial states swapping between different chains) and exchange operators (full state swapping between adjacent chains), which is very similar to the genetic algorithm. This is why it is called an evolutionary MCMC [39–41]. Here we give a quick view to understand how the evolutionary MCMC can help to overcome the difficulties encountered in model updating.