Benchmarking techniques for reconciling Bayesian small area models at distinct geographic levels

Janicki, Ryan; Vesper, Andrew

doi:10.1007/s10260-017-0379-x

Benchmarking techniques for reconciling Bayesian small area models at distinct geographic levels

Original Paper
Published: 24 March 2017

Volume 26, pages 557–581, (2017)
Cite this article

Statistical Methods & Applications Aims and scope Submit manuscript

Ryan Janicki¹ &
Andrew Vesper²

237 Accesses
6 Altmetric
Explore all metrics

Abstract

In sample surveys, there is often insufficient sample size to obtain reliable direct estimates for parameters of interest for certain domains. Precision can be increased by introducing small area models which ‘borrow strength’ by connecting different areas through use of explicit linking models, area-specific random effects, and auxiliary covariate information. One consequence of the use of small area models is that small area estimates at a lower (for example, county) geographic level typically will not aggregate to the estimate at the corresponding higher (for example, state) geographic level. Benchmarking is the statistical procedure for reconciling these differences. This paper provides new perspectives for the benchmarking problem, especially for complex Bayesian small area models which require Markov Chain Monte Carlo estimation. Two new approaches to Bayesian benchmarking are introduced: one procedure based on minimum discrimination information, and another procedure for fully Bayesian self-consistent conditional benchmarking. Notably the proposed procedures construct adjusted posterior distributions whose first and higher order moments are consistent with the benchmarking constraints. It is shown that certain existing benchmarked estimators are special cases of the proposed methodology under normality, giving a distributional justification for the use of benchmarked estimates. Additionally, a ‘flexible’ benchmarking constraint is introduced, where the higher geographic level estimate is not considered fixed, and is simultaneously adjusted, along with lower level estimates.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

mind, A methodology for multivariate small area estimation with multiple random effects

Article 23 November 2023

Estimation of small area counts with the benchmarking property

Article 29 November 2018

Estimation of small area proportions under a bivariate logistic mixed model

Article Open access 16 September 2022

Notes

SAHIE estimates can be found at www.census.gov/did/www/sahie/data/index.html.

References

Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat 12:171–178
MATH MathSciNet Google Scholar
Battese GE, Harter RH, Fuller WA (1988) An error-components model for prediction of county crop areas using survey and satellite data. J Am Stat Assoc 83:28–36
Article Google Scholar
Bell WR, Datta GS, Ghosh M (2013) Benchmarking small area estimators. Biometrika 100:189–202
Article MATH MathSciNet Google Scholar
Berger JO (1985) Statistical decision theory and Bayesian analysis, 2nd edn. Springer, New York
Book MATH Google Scholar
Datta GS, Ghosh M, Steorts R, Maples J (2011) Bayesian benchmarking with applications to small area estimation. TEST 20:574–588
Article MATH MathSciNet Google Scholar
Fay RE, Herriot RA (1979) Estimates of income from small places: an application of James–Stein procedures to census data. J Am Stat Assoc 74:269–277
Article Google Scholar
Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85:398–409
Article MATH MathSciNet Google Scholar
Ghosh M, Steorts R (2013) Two-stage Bayesian benchmarking as applied to small area estimation. TEST 22(4):670–687
Article MATH MathSciNet Google Scholar
Ghosh M, Kubokawa T, Kawakubo Y (2015) Benchmarked empirical bayes methods in multiplicative area-level models with risk evaluation. Biometrika 102:647–659
Article MATH MathSciNet Google Scholar
Isaki CT, Tsay JH, Fuller WA (2000) Estimation of census adjustment factors. Surv Methodol 26:31–42
Google Scholar
Jaynes ET (1957) Information theory and statistical mechanics. Phys Rev 106:620–630
Article MATH MathSciNet Google Scholar
Knottnerus P (2003) Sample survey theory: some pythagorean perspectives. Springer, New York
Book MATH Google Scholar
Kullback S (1959) Information theory and statistics. Wiley, New York
MATH Google Scholar
Kullback S, Liebler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86
Article MATH MathSciNet Google Scholar
Nandram B, Toto MCS, Choi JW (2011) A Bayesian benchmarking of the Scott–Smith model for small areas. J Stat Comput Simul 81:1593–1608
Article MATH MathSciNet Google Scholar
Pfeffermann D, Barnard CH (1991) New estimators for small-area means with applications to the assessment of farmland values. J Bus Econ Stat 9:73–84
Google Scholar
R Development Core Team (2011) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org/
Rao JNK, Molina I (2015) Small area estimation, 2nd edn. Wiley, New York
Seber GAF (2008) A matrix handbook for statisticians. Wiley, Hoboken
MATH Google Scholar
Sostra K, Traat I (2009) Optimal domain estimation under summation restriction. J Stat Plann Inference 139:3928–3941
Article MATH MathSciNet Google Scholar
Toto MCS, Nandram B (2010) A Bayesian predictive inference for small area means incorporating covariates and sampling weights. J Stat Plann Inference 140:2963–2979
Article MATH MathSciNet Google Scholar
Wang J, Fuller WA, Qu Y (2008) Small area estimation under a restriction. Surv Methodol 34:29–36
Google Scholar
You Y, Rao JNK (2002) A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights. Can J Stat 30:431–439
Article MATH MathSciNet Google Scholar
You Y, Rao JNK, Dick P (2004) Benchmarking hierarchical Bayes small area estimators in the Canadian census undercoverage estimation. Stat Transit 6:631–640
Google Scholar

Download references

Acknowledgements

This report is released to inform interested parties of ongoing research and to encourage discussion of work in progress. The views expressed are those of the authors and not necessarily those of the U. S. Census Bureau. The authors wish to thank the SAHIE group at the U. S. Census Bureau for many discussions about benchmarking problems related to health insurance estimation. The authors are also grateful to the associate editor and referee for their careful and detailed reviews, which greatly improved the paper.

Author information

Authors and Affiliations

U.S. Census Bureau, 4600 Silver Hill Road, Washington, DC, 20233, USA
Ryan Janicki
Deloitte Consulting LLP, 50 South Sixth Street, Suite 2800, Minneapolis, MN, 55401, USA
Andrew Vesper

Authors

Ryan Janicki
View author publications
You can also search for this author in PubMed Google Scholar
Andrew Vesper
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Andrew Vesper.

Appendix 1: Minimum discrimination information

Using the same notation as in the main text, assume the posterior distribution, $\pi $, conditional on the data and auxiliary information, is $\varvec{\theta } \sim N_{m + 1} \left( \tilde{\varvec{\theta }}, \varvec{\varSigma } \right) $. The goal of this section is to compute the distribution $\pi ^*$ which minimizes the K–L divergence (7) from the posterior distribution $\pi $, subject to the benchmarking constraints (9). This constrained minimization problem can be solved in a straightforward way using Lagrange multipliers; however, the calculations are long and tedious. The MDI distribution can be found in a more direct way due to the structure of the constraints and properties of the normal distribution.

Kullback (1959) shows that the solution $\pi ^*$ is a member of an exponential family, and if $\pi $ is Gaussian, so is $\pi ^*$. Hence, $\pi ^* \sim N_{m + 1} \left( \varvec{\mu ^*}, \varvec{\varSigma }^* \right) $, where $\varvec{\mu ^*}$ and $\varvec{\varSigma }^*$ are chosen to satisfy equation (9). Using the properties of the multivariate normal distribution, it is easy to show that

$$\begin{aligned} KL \left( \pi ^*, \pi \right)= & {} \frac{1}{2} \bigg ( \text {tr}\left( \varvec{\varSigma }^{-1} \varvec{\varSigma }^* \right) + \left( \tilde{\varvec{\theta }} - \varvec{\mu }^* \right) ^T \varvec{\varSigma }^{-1} \left( \tilde{\varvec{\theta }} - \varvec{\mu }^* \right) \nonumber \\&\quad + \log \frac{ \det \varvec{\varSigma } }{ \det \varvec{\varSigma }^*} - \left( m + 1 \right) \bigg ). \end{aligned}$$

(23)

Let $E^* \left( \cdot \right) $ and $Var^* \left( \cdot \right) $ denote the expectation and variance operators, respectively, with respect to $\pi ^*$. Then the constraints (9) can be written

$$\begin{aligned} E^* \left( T_1 \left( \varvec{\theta } \right) \right) = E^* \left( \theta \right) - E^* \left( \varvec{C}^T \varvec{\theta } \right) = 0 \end{aligned}$$

(24)

and, using (24),

$$\begin{aligned} E^* \left( T_2 \left( \varvec{\theta } \right) \right) = Var^* \left( \theta \right) - Var^* \left( \varvec{C}^T \varvec{\theta } \right) , \end{aligned}$$

(25)

so that the first constraint (24) is a function only of $\varvec{\mu }^*$ while the second constraint (25) is a function only of $\varvec{\varSigma }^*$. Since (23) can be written as the sum of two terms, one which is a function of $\tilde{\varvec{\theta }}$ and $\varvec{\mu }^*$, and the other which is a function of $\varvec{\varSigma }^*$, the optimization problem can be simplified by minimizing (23) over $\varvec{\mu }^*$ subject to constraint (24), and then minimizing (23) over $\varvec{\varSigma }^*$ subject to constraint (25). Furthermore, since the terms in (23) involving $\varvec{\varSigma }^*$ do not involve $\tilde{\varvec{\theta }}$, the covariance of $\pi ^*$ will not be a function of $\tilde{\varvec{\theta }}$, so we may set $\tilde{\varvec{\theta }} = \varvec{0}$ when solving for $\varvec{\varSigma }^*$, without loss of generality. We can then use a result of Kullback (1959) which states that for a general restriction T, the solution to the minimization problem

$$\begin{aligned} \min _{\pi ^* \left( \varvec{\theta } \right) } \int \pi ^* \left( \varvec{\theta } \right) \log \frac{\pi ^* \left( \varvec{\theta } \right) }{\pi \left( \varvec{\theta } \right) } d \varvec{\theta } \text { such that } \int T \left( \varvec{\theta } \right) \pi ^* \left( \varvec{\theta } \right) d \varvec{\theta } = 0 \end{aligned}$$

is given by $\pi ^* \left( \varvec{\theta }\right) \propto e^{\tau ^* T \left( \varvec{\theta }\right) } \pi \left( \varvec{\theta }\right) $, where $\tau ^*$ is the solution to $\frac{d}{d \tau } \log M_2(\tau ^*) = 0$ and $M_2(\tau ) = \int {e^{\tau T ( \varvec{\theta })} \pi ( \varvec{\theta }) d \varvec{\theta }}$.

1.1 Appendix 1.1: Flexible benchmarking constraint

The first moment condition, $T_1$ in (8), can be used to calculate $\varvec{\mu ^*}$, which is the mean of $\pi ^*$, the solution to

$$\begin{aligned} \min _{\pi ^* \left( \varvec{\theta }\right) } \int \pi ^* \left( \varvec{\theta }\right) \log \frac{ \pi ^* \left( \varvec{\theta }\right) }{ \pi \left( \varvec{\theta }\right) } d \varvec{\theta }\ \ \text {such that} \int T_1 \left( \varvec{\theta }\right) \pi ^* \left( \varvec{\theta }\right) \, d \varvec{\theta } = 0. \end{aligned}$$

Based on the moment generating function of the multivariate normal distribution,

$$\begin{aligned} M_2 ( \tau ) = E \left[ e^{ \tau T_1 ( \varvec{\theta }) } \right] = E \left[ e^{\tau {\varvec{R}}^T \varvec{\theta }} \right] = \text {exp}\left[ \tau \tilde{\varvec{\theta }}^T {\varvec{R}}+ \frac{1}{2} \tau ^{2} {\varvec{R}}^T \varvec{\varSigma } {\varvec{R}}\right] . \end{aligned}$$

The solution to the equation $\frac{\partial }{\partial \tau } \log M_2 \left( \tau \right) = 0$ is $\tau ^* = - \tilde{\varvec{\theta }}^T {\varvec{R}}( {\varvec{R}}^T \varvec{\varSigma } {\varvec{R}})^{-1}$; the MDI distribution, $\pi ^*$, is then

$$\begin{aligned}&\pi ^* ( \varvec{\theta }) \propto e^{\tau ^* T_1 ( \varvec{\theta })} \pi ( \varvec{\theta }) \\&\quad \propto \text {exp}\left\{ - \tilde{\varvec{\theta }}^T {\varvec{R}}( {\varvec{R}}^T \varvec{\varSigma } {\varvec{R}})^{-1} {\varvec{R}}^T \varvec{\theta }- \frac{1}{2} ( \varvec{\theta }- \tilde{\varvec{\theta }})^T \varvec{\varSigma }^{-1} ( \varvec{\theta }- \tilde{\varvec{\theta }} ) \right\} \\&\quad \propto \text {exp}\left\{ - \frac{1}{2} \bigg [ \varvec{\theta }^T \varvec{\varSigma }^{-1} \varvec{\theta }- 2 \tilde{\varvec{\theta }}^T \left( \varvec{\varSigma }^{-1} - {\varvec{R}}( {\varvec{R}}^T \varvec{\varSigma } {\varvec{R}})^{-1} {\varvec{R}}^T \right) \varvec{\theta }\bigg ] \right\} , \end{aligned}$$

so that the multivariate normal MDI distribution satisfying the first moment benchmarking constraint has mean

$$\begin{aligned} \varvec{\mu }^* = \tilde{\varvec{\theta }} - \varvec{\varSigma } {\varvec{R}}( {\varvec{R}}^T \varvec{\varSigma } {\varvec{R}})^{-1} {\varvec{R}}^T \tilde{\varvec{\theta }}. \end{aligned}$$

(26)

The second moment condition, $T_2$, can be used to calculate $\varvec{\varSigma }^*$, which is the covariance of $\pi ^*$, the solution to

$$\begin{aligned} \min _{\pi ^* ( \varvec{\theta })} \int \pi ^* ( \varvec{\theta }) \log \frac{ \pi ^* ( \varvec{\theta }) }{ \pi ( \varvec{\theta }) } d \varvec{\theta }\ \ \text {such that} \int T_2 ( \varvec{\theta }) \pi ^* ( \varvec{\theta }) \, d \varvec{\theta } = 0. \end{aligned}$$

Without loss of generality, assume $\tilde{\varvec{\theta }} = {\mathbf {0}}$. Under this assumption,

$$\begin{aligned} {\varvec{C}}^T \varvec{\theta }_s = \sum ^m_{j = 1} {c_j \theta _j} \sim N ( 0, \sigma ^2_s ), \end{aligned}$$

where $\sigma ^2_s = {\varvec{C}}^T \varvec{\varSigma }_s {\varvec{C}}$. Thus, we have the following independent distributions:

$$\begin{aligned} \frac{\left( {\varvec{C}}^T \varvec{\theta }_s \right) ^2}{\sigma ^2_s} \sim \chi ^2_1, \ \ \frac{\theta ^2}{\sigma ^2} \sim \chi ^2_1. \end{aligned}$$

Using the moment generating function of the Chi-Squared distribution,

$$\begin{aligned} M_2 ( \tau )= & {} E \left[ e^{\tau T_2 ( \varvec{\theta })} \right] = E \left[ e^{ \tau \left( \theta ^2 - \left( {\varvec{C}}^T \varvec{\theta }_s \right) ^2 \right) } \right] \\= & {} E \left[ e^{\tau \theta ^2} \right] E \left[ e^{- \tau \left( {\varvec{C}}^T \varvec{\theta }_s \right) ^2} \right] = ( 1 - 2 \tau \sigma ^2 )^{- \frac{1}{2}} ( 1 + 2 \tau \sigma ^2_s )^{- \frac{1}{2}}. \end{aligned}$$

Solving $ \partial \log M_2 \left( \tau \right) / \partial \tau = 0 $ for $\tau $ gives $\tau ^* = \left( \sigma ^2_s - \sigma ^2 \right) / \left( 4 \sigma ^2_s \sigma ^2 \right) $, hence the MDI distribution, $\pi ^*$, is

$$\begin{aligned}&\pi ^* ( \varvec{\theta }) \propto e^{\tau ^* T_2 ( \varvec{\theta })} \pi ( \varvec{\theta }) \propto \text {exp}\left[ \tau ^* \left( \theta ^2 - \left( {\varvec{C}}^T \varvec{\theta }_s \right) ^2 \right) - \frac{1}{2 \sigma ^2} \theta ^2 - \frac{1}{2} \varvec{\theta }_s^T \varvec{\varSigma }^{-1}_s \varvec{\theta }_s\right] \\&\quad \propto \text {exp}\left[ - \frac{1}{2} \left( \frac{1}{\sigma ^2} - 2 \tau ^* \right) \theta ^2 \right] \text {exp}\left[ - \frac{1}{2} \varvec{\theta }_s^T \left( 2 \tau ^* {\varvec{C}}{\varvec{C}}^T + \varvec{\varSigma }^{-1}_s \right) \varvec{\theta }_s\right] . \end{aligned}$$

This factorization shows that the multivariate normal MDI distribution satisfying the second moment benchmarking constraint has covariance matrix

$$\begin{aligned} \varvec{\varSigma }^* = \left[ \begin{array}{cc} \frac{2 \sigma ^2_s \sigma ^2}{\sigma ^2_s + \sigma ^2} &{} {\mathbf {0}} \\ {\mathbf {0}} &{} \left( \frac{\sigma ^2_s - \sigma ^2}{2 \sigma ^2_s \sigma ^2} {\varvec{C}}{\varvec{C}}^T + \varvec{\varSigma }^{-1}_s \right) ^{-1} \end{array} \right] . \end{aligned}$$

(27)

Combining Eqs. (26) and (27), the MDI distribution satisfying the first and second moment benchmarking restrictions is $\varvec{\theta }^* \sim N_{m + 1} \left( \varvec{\mu ^*}, \varvec{\varSigma }^* \right) $.

1.2 Appendix 1.2: Fixed benchmarking constraint

If the higher level parameters are assumed fixed and known, the moment restrictions are

$$\begin{aligned} T^{\text {fix}}_1 \left( \varvec{\theta }_s \right) = \hat{\theta } - \sum ^m_{j = 1} {c_j \theta _j}, \ \ \ T^{\text {fix}}_2 \left( \varvec{\theta }_s \right) = \left( D + \hat{\theta }^2\right) - \left( \sum ^m_{j = 1} c_j \theta _j \right) ^2. \end{aligned}$$

Using the same procedures as above, the first and second moments of the MDI distribution can be computed separately, and it can be shown that the MDI distribution is $\varvec{\theta }^* \sim N_m \left( \varvec{\mu }^*, \varvec{\varSigma }^* \right) $, where

$$\begin{aligned} \varvec{\mu }^* = \tilde{\varvec{\theta }}_s + \varvec{\varSigma }_s {\varvec{C}}\left( {\varvec{C}}^T \varvec{\varSigma }_s {\varvec{C}}\right) ^{-1} \left( \hat{\theta } - \sum ^m_{j = 1} {c_j \tilde{\theta }_j} \right) , \end{aligned}$$

(28)

and

$$\begin{aligned} \varvec{\varSigma }^* = \left( \left( \frac{\sigma ^2_s - D}{D \sigma ^2_s} \right) {\varvec{C}}{\varvec{C}}^T + \varvec{\varSigma }^{-1}_s \right) ^{-1}. \end{aligned}$$

(29)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Janicki, R., Vesper, A. Benchmarking techniques for reconciling Bayesian small area models at distinct geographic levels. Stat Methods Appl 26, 557–581 (2017). https://doi.org/10.1007/s10260-017-0379-x

Download citation

Accepted: 12 March 2017
Published: 24 March 2017
Issue Date: November 2017
DOI: https://doi.org/10.1007/s10260-017-0379-x

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Benchmarking techniques for reconciling Bayesian small area models at distinct geographic levels

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

mind, A methodology for multivariate small area estimation with multiple random effects

Estimation of small area counts with the benchmarking property

Estimation of small area proportions under a bivariate logistic mixed model

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Appendix 1: Minimum discrimination information

Appendix 1: Minimum discrimination information

1.1 Appendix 1.1: Flexible benchmarking constraint

1.2 Appendix 1.2: Fixed benchmarking constraint

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Subscribe and save

Buy Now