
Data Augmentation


Definition

Data augmentation is a Markov chain Monte Carlo (MCMC) algorithm for sampling from a Bayesian posterior distribution.

Background

Data augmentation was originally developed by Tanner and Wong [10] as a stochastic counterpart of the EM algorithm [1], and it is closely related to the Gibbs sampler [2]. Accordingly, the basic setup of data augmentation parallels that of the EM algorithm.

Theory

Let y be the observed data and z be the missing data or latent variable. Let p(y, z | θ) be the probability distribution of the complete data (y, z), with θ being the unknown parameter. The marginal distribution of the observed data y is \(p(y \vert \theta) = \int p(y, z \vert \theta)\,dz\). Let p(θ) be the prior distribution of θ. The goal is to draw Monte Carlo samples from the posterior distribution \(p(\theta \vert y) \propto p(\theta)\, p(y \vert \theta)\).
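As a concrete illustration (a hypothetical example, not from the original entry), take a two-component Gaussian mixture: each observation \(y_i\) is drawn from \(N(\mu_{z_i}, 1)\), where \(z_i \in \{0, 1\}\) is a latent component label and \(\theta = (\mu_0, \mu_1)\). Since z is discrete here, the marginal likelihood becomes the sum \(p(y \vert \theta) = \sum_z p(y, z \vert \theta)\); the posterior \(p(\theta \vert y)\) is awkward to sample directly, but the complete-data posterior \(p(\theta \vert y, z)\) is a simple normal distribution. This gap between the two posteriors is exactly what data augmentation exploits.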

The data augmentation algorithm is iterative. It starts from an initial value \(\theta_0\). Let \((\theta_t, z_t)\) be the values of \(\theta\) and \(z\) at iteration \(t\). Each iteration consists of two steps: an imputation step, which draws \(z_{t+1}\) from the conditional distribution \(p(z \vert y, \theta_t)\), and a posterior step, which draws \(\theta_{t+1}\) from \(p(\theta \vert y, z_{t+1})\). The resulting sequence \(\{\theta_t\}\) forms a Markov chain whose stationary distribution is the target posterior \(p(\theta \vert y)\).
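Below is a minimal sketch of a DA sampler in Python for the hypothetical mixture example above. The model, priors, and all variable names are illustrative assumptions rather than part of the original entry; the I-step (imputation) draws the labels given the current means, and the P-step (posterior) draws the means from their conjugate normal conditionals given the labels.

# Minimal sketch (hypothetical example, not from the entry): a DA
# sampler for a two-component Gaussian mixture with unit variances,
# equal mixing weights, and independent N(0, tau2) priors on the means.
import numpy as np

rng = np.random.default_rng(0)

# Simulate observed data y; the true means are -2 and 2.
n = 200
true_z = rng.integers(0, 2, size=n)
y = rng.normal(np.where(true_z == 1, 2.0, -2.0), 1.0)

tau2 = 10.0  # prior variance of each component mean

def da_sampler(y, n_iter=2000):
    mu = np.array([-1.0, 1.0])  # initial value theta_0
    samples = np.empty((n_iter, 2))
    for t in range(n_iter):
        # I-step: draw z_{t+1} ~ p(z | y, theta_t); for each i,
        # P(z_i = 1) is proportional to the N(mu_1, 1) density at y_i.
        log_p0 = -0.5 * (y - mu[0]) ** 2
        log_p1 = -0.5 * (y - mu[1]) ** 2
        p1 = 1.0 / (1.0 + np.exp(log_p0 - log_p1))
        z = rng.random(len(y)) < p1
        # P-step: draw theta_{t+1} ~ p(theta | y, z_{t+1}) via the
        # conjugate normal update for each component mean.
        for k, mask in enumerate((~z, z)):
            var_k = 1.0 / (mask.sum() + 1.0 / tau2)
            mu[k] = rng.normal(var_k * y[mask].sum(), np.sqrt(var_k))
        samples[t] = mu
    return samples

samples = da_sampler(y)
print("posterior means of (mu_0, mu_1):", samples[500:].mean(axis=0))

With well-separated components this chain mixes quickly; when the augmented data carry a large fraction of the information about θ, the chain can mix slowly, which motivates the parameter-expanded augmentation schemes in [4, 6, 8].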


References

  1. Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39:1–38

  2. Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741

  3. Higdon DM (1998) Auxiliary variable methods for Markov chain Monte Carlo with applications. J Am Stat Assoc 93:585–595

  4. Liu JS, Wu YN (1999) Parameter expansion for data augmentation. J Am Stat Assoc 94(448):1264–1274

  5. Liu JS, Wong WH, Kong A (1994) Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81:27–40

  6. Liu C, Rubin DB, Wu YN (1998) Parameter expansion to accelerate EM: the PX-EM algorithm. Biometrika 85(4):755–770

  7. Meng XL, van Dyk D (1997) The EM algorithm – an old folk-song sung to a fast new tune. J R Stat Soc B 59:511–567

  8. Meng XL, van Dyk D (1999) Seeking efficient data augmentation schemes via conditional and marginal augmentation. Biometrika 86:301–320

  9. Swendsen RH, Wang J (1987) Nonuniversal critical dynamics in Monte Carlo simulations. Phys Rev Lett 58:86–88

  10. Tanner MA, Wong WH (1987) The calculation of posterior distributions by data augmentation. J Am Stat Assoc 82:528–540


Author information


Correspondence to Ying Nian Wu.


Copyright information

© 2014 Springer Science+Business Media New York


Cite this entry

Wu, Y.N. (2014). Data Augmentation. In: Ikeuchi, K. (eds) Computer Vision. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-31439-6_741

