
Bayesian model averaging of possibly similar nonparametric densities

  • Original Paper
  • Published in: Computational Statistics

Abstract

We consider an alternative or expanded data environment in which sample data are available from a set of densities that are thought to be similar (as measured by Kullback–Leibler divergence). While estimation methods could easily be applied to each sample separately, the purpose of this manuscript is to develop an estimator that: (1) offers greater efficiency if the densities are in fact similar, while seemingly losing none if they are dissimilar; (2) does not require knowledge about the form or extent of similarities between the densities; (3) can be used with parametric or nonparametric methods; (4) allows for correlated data; and (5) is relatively easy to implement. Simulations indicate that finite sample performance, in particular small sample performance, is quite promising. Interestingly, when the set of possible densities contains both similar and dissimilar densities, the proposed estimator appropriately puts weight on the similar densities and not on the dissimilar ones. We apply the proposed estimator to recover a set of county crop yield densities and their corresponding crop insurance premium rates.




Notes

  1. We note that KL divergence is not a true distance measure like Hellinger distance because it does not satisfy the triangle inequality.

  2. The Kullback–Leibler divergence between densities f(x) and g(x) is defined as \(KL(f(x),g(x)) = \int \log\left(\frac{g(x)}{f(x)}\right) g(x)\,dx\).

  3. All simulation results using mean integrated squared error are available from the authors.

  4. The densities are: (1) standard normal; (2) skewed unimodal; (3) strongly skewed unimodal; (4) kurtotic unimodal; (5) outlier; (6) bimodal; (7) separated bimodal; (8) asymmetric bimodal; (9) trimodal. The remaining densities are very perverse shapes that would not represent yield densities.
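The definition in note 2 can be checked numerically. The following is a minimal sketch, not the authors' code: it approximates the integral with a trapezoid rule and compares it with the closed form for two normal densities. The function names, the choice of normal densities, and the integration grid are all assumptions made for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def kl_divergence(f, g, lo=-20.0, hi=20.0, n=40001):
    """Trapezoid-rule approximation of the integral of log(g/f) * g on [lo, hi].

    Keeps the argument order of note 2, KL(f, g), so the expectation is
    taken under g. The grid bounds and size are illustrative assumptions.
    """
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        fx, gx = f(x), g(x)
        if fx > 0.0 and gx > 0.0:  # skip points where a density underflows to zero
            weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end weights
            total += weight * gx * math.log(gx / fx)
    return total * step

def kl_normal_closed_form(mu_f, sigma_f, mu_g, sigma_g):
    """Closed form of the same integral when f and g are both normal."""
    return (math.log(sigma_f / sigma_g)
            + (sigma_g ** 2 + (mu_g - mu_f) ** 2) / (2.0 * sigma_f ** 2)
            - 0.5)

# Hypothetical example densities: f = N(0, 2^2), g = N(0, 1).
f = lambda x: normal_pdf(x, mu=0.0, sigma=2.0)
g = lambda x: normal_pdf(x, mu=0.0, sigma=1.0)

kl_fg = kl_divergence(f, g)      # expectation under g
kl_gf = kl_divergence(g, f)      # arguments swapped: expectation under f
exact_fg = kl_normal_closed_form(0.0, 2.0, 0.0, 1.0)

print(kl_fg, exact_fg, kl_gf)
```

For this pair the two orderings give different values (roughly 0.318 and 0.807), which illustrates that the divergence is not symmetric in its arguments.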


Author information


Corresponding author

Correspondence to Alan P. Ker.

Appendix


See Tables 3, 4, 5, 6 and 7.

Table 3 Worst case scenario, h chosen by min ISE
Table 4 Mixed case scenario, h chosen by min ISE
Table 5 Best case scenario, h chosen by min ISE
Table 6 Density function of candidate densities in shifting moment simulation
Table 7 KL divergence of shifting moment simulations


About this article


Cite this article

Ker, A.P., Liu, Y. Bayesian model averaging of possibly similar nonparametric densities. Comput Stat 32, 349–365 (2017). https://doi.org/10.1007/s00180-016-0700-4


  • DOI: https://doi.org/10.1007/s00180-016-0700-4
