Abstract
We review adaptive Markov chain Monte Carlo (MCMC) algorithms as a means of optimising their performance. Using simple toy examples we review their theoretical underpinnings, and in particular show why adaptive MCMC algorithms may fail when some fundamental properties are not satisfied. This leads to guidelines for the design of correct algorithms. We then review criteria and the useful framework of stochastic approximation, which allows one not only to systematically optimise commonly used criteria but also to analyse the properties of adaptive MCMC algorithms. We then propose a series of novel adaptive algorithms which prove to be robust and reliable in practice. These algorithms are applied to artificial and high-dimensional scenarios, as well as to the classic mine disaster dataset inference problem.
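To make the stochastic approximation idea concrete, the following minimal sketch adapts the scale of a random-walk Metropolis proposal with a Robbins-Monro update towards a fixed target acceptance rate (0.234 is the value commonly cited for random-walk Metropolis in high dimensions). This is an illustrative assumption-laden example, not the specific algorithms proposed in the paper; the function name, the decreasing step size 1/(i+1), and the Gaussian target are all choices made here for illustration.

import numpy as np

def adaptive_rwm(log_target, x0, n_iter=5000, target_acc=0.234, seed=0):
    """Random-walk Metropolis with a Robbins-Monro (stochastic approximation)
    update of the log proposal scale towards a target acceptance rate.
    Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    log_sigma = 0.0                      # log of the proposal standard deviation
    samples = np.empty((n_iter, d))
    for i in range(n_iter):
        prop = x + np.exp(log_sigma) * rng.standard_normal(d)
        # acceptance probability for a symmetric (Gaussian) proposal
        alpha = min(1.0, np.exp(log_target(prop) - log_target(x)))
        if rng.random() < alpha:
            x = prop
        # Robbins-Monro step: the diminishing step size 1/(i+1) makes the
        # adaptation vanish over time, a key ingredient for valid adaptive MCMC
        log_sigma += (alpha - target_acc) / (i + 1)
        samples[i] = x
    return samples, np.exp(log_sigma)

if __name__ == "__main__":
    # Example usage on a standard 10-dimensional Gaussian target
    log_target = lambda x: -0.5 * np.sum(x ** 2)
    samples, sigma = adaptive_rwm(log_target, np.zeros(10))
    print("adapted proposal scale:", sigma)

The design point illustrated is that the tuning parameter (here the proposal scale) is updated at every iteration from the chain's own output, with a step size decreasing fast enough that the transition kernel eventually stabilises.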
Cite this article
Andrieu, C., Thoms, J. A tutorial on adaptive MCMC. Stat Comput 18, 343–373 (2008). https://doi.org/10.1007/s11222-008-9110-y