Objective Bayesian testing on the common mean of several normal distributions under divergence-based priors

Abstract

This paper considers the problem of testing hypotheses about the common mean of several normal distributions. We propose a solution based on a Bayesian model selection procedure in which no subjective input is required: proper priors for testing hypotheses about the common mean are constructed from measures of divergence between the competing models, following the divergence-based (DB) prior approach of Bayarri and García-Donato (J R Stat Soc B 70:981–1003, 2008). The behavior of the Bayes factors based on the DB priors is compared with that of the fractional Bayes factor in a simulation study, and with existing tests in two real examples.

References

  • Bayarri MJ, García-Donato G (2007) Extending conventional priors for testing general hypotheses in linear models. Biometrika 94:135–152

  • Bayarri MJ, García-Donato G (2008) Generalization of Jeffreys divergence-based priors for Bayesian hypothesis testing. J R Stat Soc B 70:981–1003

  • Berger JO, Bernardo JM (1992) On the development of reference priors (with discussion). In: Bernardo JM et al (eds) Bayesian statistics IV. Oxford University Press, Oxford, pp 35–60

  • Berger JO, Pericchi LR (1996) The intrinsic Bayes factor for model selection and prediction. J Am Stat Assoc 91:109–122

  • Berger JO, Mortera J (1999) Default Bayes factors for nonnested hypothesis testing. J Am Stat Assoc 94:542–554

  • Bertolino F, Racugno W, Moreno E (2000) Bayesian model selection approach to analysis of variance under heteroscedasticity. J R Stat Soc D 49:503–517

  • Brown LD, Cohen A (1974) Estimation of a common mean and recovery of interblock information. Ann Stat 8:205–211

  • Chang CH, Pal N (2008) Testing on the common mean of several normal distributions. Comput Stat Data Anal 53:321–333

  • Cohen A, Sackrowitz HB (1974) On estimating the common mean of two normal populations. Ann Stat 2:1274–1282

  • Datta GS, Ghosh M (1995) Some remarks on noninformative priors. J Am Stat Assoc 90:1357–1363

  • De Santis F, Spezzaferri F (1999) Methods for default and robust Bayesian model comparison: the fractional Bayes factor approach. Int Stat Rev 67:267–286

  • Eberhardt KR, Reeve CP, Spiegelman CH (1989) A minimax approach to combining means, with practical examples. Chemometr Intell Lab Syst 5:129–148

  • Fairweather WR (1972) A method of obtaining an exact confidence interval for the common mean of several normal populations. Appl Stat 21:229–233

  • García-Donato G, Sun D (2007) Objective priors for hypothesis testing in one-way random effects models. Can J Stat 35:303–320

  • Ghosh M, Kim YH (2001) Interval estimation of the common mean of several normal populations: a Bayes-frequentist synthesis. In: Saleh AKMdE (ed) Data analysis from statistical foundations. Nova Science Publishers, New York, pp 277–294

  • Jeffreys H (1961) Theory of probability, 3rd edn. Oxford University Press, Oxford

  • Jordan SM, Krishnamoorthy K (1996) Exact confidence intervals for the common mean of several normal populations. Biometrics 52:77–86

  • Kass RE, Wasserman L (1995) A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. J Am Stat Assoc 90:928–934

  • Krishnamoorthy K, Lu Y (2003) Inference on the common mean of several normal populations based on the generalized variable method. Biometrics 59:237–247

  • Li X, Williamson PP (2014) Testing on the common mean of normal distributions using Bayesian methods. J Stat Comput Simul 84:1363–1380

  • Lin SH, Lee JC (2005) Generalized inferences on the common mean of several normal populations. J Stat Plan Inference 134:568–582

  • Meier P (1953) Variance of a weighted mean. Biometrics 9:59–73

  • Montgomery DC (1991) Design and analysis of experiments, 8th edn. Wiley, New York

  • Moreno E (1997) Bayes factor for intrinsic and fractional priors in nested models: Bayesian robustness. In: Yadolah D (ed) L1-statistical procedures and related topics, 31. Institute of Mathematical Statistics, Hayward, pp 257–270

  • Moreno E, Bertolino F, Racugno W (1998) An intrinsic limiting procedure for model selection and hypotheses testing. J Am Stat Assoc 93:1451–1460

  • O’Hagan A (1995) Fractional Bayes factors for model comparison (with discussion). J R Stat Soc B 57:99–138

  • O’Hagan A (1997) Properties of intrinsic and fractional Bayes factors. Test 6:101–118

  • Pérez J, Berger JO (2002) Expected posterior prior distributions for model selection. Biometrika 89:491–512

  • Sinha BK (1985) Unbiased estimation of the variance of the Graybill-Deal estimator of the common mean of several normal populations. Can J Stat 13:243–247

  • Smith AFM, Spiegelhalter DJ (1980) Bayes factor and choice criteria for linear models. J R Stat Soc B 42:213–220

  • Snedecor GW (1950) The statistical part of the scientific method. Ann N Y Acad Sci 52:792–799

  • Yu PL, Sun Y, Sinha BK (1999) On exact confidence intervals for the common mean of several normal populations. J Stat Plan Inference 81:263–277

  • Zellner A, Siow A (1980) Posterior odds ratios for selected regression hypotheses. In: Bernardo JM et al (eds) Bayesian statistics 1. University Press, Valencia, pp 585–603

  • Zellner A, Siow A (1984) Basic issues in econometrics. University of Chicago Press, Chicago

  • Zhou L, Mathew T (1993) Combining independent tests in linear models. J Am Stat Assoc 88:650–655

Author information

Corresponding author

Correspondence to Yongku Kim.

Appendices

Appendix 1: Proof of Theorem 1

Consider model \(M_1\),

$$\begin{aligned} M_1: f_1(\mathbf{x}\vert \mu _{0}, \sigma _1,\ldots ,\sigma _{k})= \prod _{i=1}^k N(\mathbf{x}_i \vert \mu _0,\sigma _i^2), \pi ^N(\sigma _1,\ldots ,\sigma _{k})=\prod _{i=1}^k \sigma _i^{-1}\quad \end{aligned}$$
(21)

and model \(M_2\)

$$\begin{aligned} M_2: f_2(\mathbf{x}\vert \mu , \sigma _1,\ldots ,\sigma _{k})= \prod _{i=1}^k N(\mathbf{x}_i \vert \mu ,\sigma _i^2), \pi ^N(\mu ,\sigma _1,\ldots ,\sigma _{k})=\prod _{i=1}^k \sigma _i^{-1}.\qquad \end{aligned}$$
(22)

Let \({\varvec{\theta }}=\mu \) and \({\varvec{\nu }}=(\sigma _1,\ldots ,\sigma _k)\). Then the Kullback–Leibler directed divergence \(KL[({\varvec{\theta }}_0,{\varvec{\nu }}):({\varvec{\theta }},{\varvec{\nu }})]\) is given by

$$\begin{aligned} KL[({\varvec{\theta }}_0,{\varvec{\nu }}):({\varvec{\theta }},{\varvec{\nu }})]= & {} \int \log \left( { \prod _{i=1}^k N(\mathbf{x}_i \vert \mu ,\sigma _i^2) \over \prod _{i=1}^k N(\mathbf{x}_i \vert \mu _0,\sigma _i^2)}\right) \left( \prod _{i=1}^k N(\mathbf{x}_i \vert \mu ,\sigma _i^2)\right) d\mathbf{x}\nonumber \\= & {} {1\over 2}\sum _{i=1}^{k} {n_i(\mu _0-\mu )^2\over \sigma _i^2}. \end{aligned}$$

Moreover, the Kullback–Leibler directed divergence \(KL[({\varvec{\theta }},{\varvec{\nu }}):({\varvec{\theta }}_0,{\varvec{\nu }})]\) is given by

$$\begin{aligned} KL[({\varvec{\theta }},{\varvec{\nu }}):({\varvec{\theta }}_0,{\varvec{\nu }})]= & {} \int \log \left( { \prod _{i=1}^k N(\mathbf{x}_i \vert \mu _0,\sigma _i^2) \over \prod _{i=1}^k N(\mathbf{x}_i \vert \mu ,\sigma _i^2)}\right) \left( \prod _{i=1}^k N(\mathbf{x}_i \vert \mu _0,\sigma _i^2)\right) d\mathbf{x}\nonumber \\= & {} {1\over 2}\sum _{i=1}^{k} {n_i(\mu _0-\mu )^2\over \sigma _i^2}. \end{aligned}$$

Therefore the sum divergence measure is

$$\begin{aligned} D^S[({\varvec{\theta }},{\varvec{\theta }}_0)\vert {\varvec{\nu }}]= & {} \sum _{i=1}^{k} {n_i(\mu _0-\mu )^2\over \sigma _i^2}. \end{aligned}$$

Additionally, since \(KL[({\varvec{\theta }}_0,{\varvec{\nu }}):({\varvec{\theta }},{\varvec{\nu }})]\) and \(KL[({\varvec{\theta }},{\varvec{\nu }}):({\varvec{\theta }}_0,{\varvec{\nu }})]\) are equal, the minimum divergence measure coincides with the sum divergence measure.
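As a quick numerical sanity check (not part of the original paper), the closed form above can be compared against a Monte Carlo estimate of the directed Kullback–Leibler divergence; the group sizes \(n_i\), scales \(\sigma _i\), and means \(\mu _0,\mu \) below are hypothetical values chosen for illustration:

```python
# Monte Carlo check (illustrative, hypothetical values) of
# KL[(theta0,nu):(theta,nu)] = (1/2) * sum_i n_i * (mu0 - mu)^2 / sigma_i^2.
import numpy as np

rng = np.random.default_rng(0)
mu0, mu = 0.0, 1.5
sigma = np.array([1.0, 2.0, 0.5])   # group standard deviations
n_i = np.array([4, 6, 5])           # group sample sizes

# Closed form derived in Appendix 1.
kl_closed = 0.5 * np.sum(n_i * (mu0 - mu) ** 2 / sigma**2)

# Monte Carlo: E_{f(.|mu)}[log f(x|mu) - log f(x|mu0)]; the normal
# normalizing constants cancel, leaving only the quadratic terms.
m = 100_000
est = np.zeros(m)
for s, n in zip(sigma, n_i):
    x = rng.normal(mu, s, size=(m, n))
    est += np.sum((x - mu0) ** 2 - (x - mu) ** 2, axis=1) / (2 * s**2)

print(kl_closed, est.mean())        # the two values should agree closely
```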

We take the effective sample size \(n^*=n\). The effective sample size is a key ingredient in the definition of the DB prior, ensuring that the information incorporated into the prior is equivalent to that contained in an imaginary sample of the chosen size (Bayarri and García-Donato 2008; García-Donato and Sun 2007). The idea of basing priors for model selection on unit sample information has proved successful for many authors (see Smith and Spiegelhalter 1980; Kass and Wasserman 1995).

Since

$$\begin{aligned} {\bar{D}}^S[({\varvec{\theta }},{\varvec{\theta }}_0)\vert {\varvec{\nu }}]= \sum _{i=1}^{k}{n_i\over n} \left( {\mu _0-\mu \over \sigma _i}\right) ^2, \end{aligned}$$

the normalizing constant is

$$\begin{aligned} c_S(q,{\varvec{\nu }})= & {} \int (1+{\bar{D}}^S[({\varvec{\theta }},{\varvec{\theta }}_0) \vert {\varvec{\nu }}])^{-q}\pi ^N({\varvec{\theta }}\vert {\varvec{\nu }})\hbox {d}{\varvec{\theta }}\\= & {} \int _{-\infty }^{\infty }\left[ 1+ \sum _{i=1}^{k}{n_i\over n} \left( {\mu _0-\mu \over \sigma _i}\right) ^2\right] ^{-q}\hbox {d}\mu , \end{aligned}$$

whose integrand decays like \(\vert \mu \vert ^{-2q}\) as \(\vert \mu \vert \rightarrow \infty \), so \(c_S(q,{\varvec{\nu }})<\infty \) if \(q> {1\over 2}\). Thus, the conditional sum DB prior with \(q_*^S=1\) is given by

$$\begin{aligned} \pi ^D(\mu \vert \sigma _1,\ldots ,\sigma _k) =c_S^{-1} \left[ 1+\sum _{i=1}^{k}{n_i\over n}\left( {\mu _0-\mu \over \sigma _i}\right) ^2\right] ^{-1}, \end{aligned}$$

where \(c_S\equiv c_S (q_*^S,\sigma _1,\ldots ,\sigma _k) =\int _{-\infty }^{\infty } [1+\sum _{i=1}^{k}{n_i\over n}({\mu _0-\mu \over \sigma _i})^2]^{-1}\hbox {d}\mu .\) This proves Theorem 1. \(\square \)
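For illustration only (not from the paper), \(c_S\) and the resulting conditional DB prior can be evaluated by one-dimensional quadrature; the values of \(\mu _0\), \(\sigma _i\), and \(n_i\) below are hypothetical:

```python
# Quadrature sketch (hypothetical values) for the normalizing constant c_S
# and the conditional sum DB prior of Theorem 1 with q* = 1.
import numpy as np
from scipy.integrate import quad

mu0 = 0.0                             # hypothesized common mean
sigma = np.array([1.0, 2.0, 0.5])     # group standard deviations
n_i = np.array([4, 6, 5])             # group sample sizes
n = n_i.sum()

def kernel(mu):
    # [1 + sum_i (n_i/n) * ((mu0 - mu)/sigma_i)^2]^{-1}
    return 1.0 / (1.0 + np.sum((n_i / n) * ((mu0 - mu) / sigma) ** 2))

c_S, _ = quad(kernel, -np.inf, np.inf)

def db_prior(mu):
    """Conditional sum DB prior pi^D(mu | sigma_1, ..., sigma_k)."""
    return kernel(mu) / c_S

# The prior integrates to one and has Cauchy-like tails in mu.
print(c_S, quad(db_prior, -np.inf, np.inf)[0])
```

Because the kernel is an inverted quadratic in \(\mu \), the resulting prior is a scaled Cauchy density centered at \(\mu _0\).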

Appendix 2: Minimal training sample size

Let \(x_m(l)=(x_{11},x_{21},x_{22},\ldots , x_{k1},x_{k2})\) denote a candidate training sample of size \(2k-1\), containing one observation from the first population and two from each of the others. Under model \(M_1\), the \(x_{ij}\) are independently distributed as \(N(\mu _0,\sigma _i^2)\), and under \(M_2\) as \(N(\mu ,\sigma _i^2)\). Thus, under model \(M_1\), the marginal density is given by

$$\begin{aligned} m_1(\mathbf{z})= & {} \int _0^{\infty }\ldots \int _0^{\infty } \left( \sqrt{2\pi }\right) ^{-2k+1}\sigma _1^{-2} \exp \left\{ -{(x_{11}-\mu _0)^2 \over 2\sigma _1^2} \right\} \\&\times \left( \prod _{i=2}^{k}\sigma _i^{-3}\right) \exp \left\{ -\sum _{i=2}^{k} \sum _{j=1}^{2} {(x_{ij}-\mu _0)^2 \over 2\sigma _i^2} \right\} \hbox {d}\sigma _1\ldots \hbox {d}\sigma _k\\= & {} \left( \sqrt{2\pi }\right) ^{-2k+1} 2^{-k}\left[ {(x_{11}-\mu _0)^2\over 2}\right] ^{-{1\over 2}} \prod _{i=2}^{k} \left[ {\sum _{j=1}^2(x_{ij}-\mu _0)^2 \over 2}\right] ^{-1}. \end{aligned}$$

Therefore, \(m_1(\mathbf{z})\) is infinite when \(x_{11}=\mu _0\), so a training sample of size \(2k-1\) does not yield a finite marginal for all data values.

Now let \(x_m(l)=(x_{11},x_{12},\ldots , x_{k1},x_{k2})\) denote a training sample of size \(2k\), with two observations from each population. Under model \(M_1\), the \(x_{ij}\) are independently distributed as \(N(\mu _0,\sigma _i^2)\), and under \(M_2\) as \(N(\mu ,\sigma _i^2)\). Thus, under model \(M_1\),

$$\begin{aligned} m_1(\mathbf{z})= \left( \sqrt{2\pi }\right) ^{-2k} \prod _{i=1}^{k} \left[ {\sum _{j=1}^2(x_{ij}-\mu _0)^2}\right] ^{-1}<\infty . \end{aligned}$$

Further, under model \(M_2\),

$$\begin{aligned} m_2(\mathbf{z})= & {} \left( \sqrt{2\pi }\right) ^{-2k} \int _{-\infty }^{\infty }\prod _{i=1}^{k} \left[ {S_i^2+2({\bar{x}}_i-\mu )^2 }\right] ^{-1}\hbox {d}\mu \\= & {} \left( \sqrt{2\pi }\right) ^{-2k} \left( \prod _{i=1}^{k} S_i^{-2}\right) \int _{-\infty }^{\infty }\prod _{i=1}^{k} \left[ {1+{2({\bar{x}}_i-\mu )^2\over S_i^2}}\right] ^{-1}\hbox {d}\mu \\\le & {} \left( \sqrt{2\pi }\right) ^{-2k} \left( \prod _{i=1}^{k} S_i^{-2} \right) \int _{-\infty }^{\infty } \left[ {1+{2({\bar{x}}_1-\mu )^2\over S_1^2}}\right] ^{-1}\hbox {d}\mu . \end{aligned}$$

The last integral is finite because its integrand is proportional to a Student-\(t\) density in \(\mu \); hence \(m_2(\mathbf{z})<\infty \). This proves that the minimal training sample size is \(2k\). \(\square \)
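As a brief check (not in the paper), the per-group \(\sigma _i\) integral used above can be verified numerically for a single group with two observations; the data values are hypothetical:

```python
# Quadrature check (hypothetical data) of the per-group marginal
# int_0^inf N(x1|mu0,s^2) N(x2|mu0,s^2) s^{-1} ds
#   = (2*pi)^{-1} * [sum_j (x_j - mu0)^2]^{-1}.
import numpy as np
from scipy.integrate import quad

mu0 = 0.0
x = np.array([0.7, -1.2])             # two hypothetical observations
A = np.sum((x - mu0) ** 2)

m1_closed = 1.0 / (2 * np.pi * A)     # closed form

def integrand(s):
    return (2 * np.pi * s**2) ** -1 * np.exp(-A / (2 * s**2)) / s

m1_quad, _ = quad(integrand, 0, np.inf)
print(m1_closed, m1_quad)             # should agree
```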

Appendix 3: Proof of Theorem 2

The likelihood function under model \(M_1\) is

$$\begin{aligned} L_1(\sigma _1,\ldots ,\sigma _k\vert \mathbf {x} ) = \left( \sqrt{2\pi }\right) ^{-n} \left( \prod _{i=1}^{k}\sigma _i^{-n_i}\right) \exp \left\{ -\sum _{i=1}^k {1\over 2\sigma _i^2}\left[ S_i^2+n_i({\bar{x}}_i-\mu _0)^2\right] \right\} , \end{aligned}$$
(23)

where \(S_i^2=\sum _{j=1}^{n_i} (x_{ij}-{\bar{x}}_i)^2\) and \({\bar{x}}_i =\sum _{j=1}^{n_i} x_{ij}/n_i, i=1,\ldots ,k\). Additionally, under model \(M_1\), the reference prior for \((\sigma _1,\ldots ,\sigma _k)\) is

$$\begin{aligned} \pi _1^N(\sigma _1,\ldots ,\sigma _k) \propto \prod _{i=1}^k \sigma _i^{-1}. \end{aligned}$$
(24)

Then, from the likelihood (23) and the reference prior (24), the marginal density \(m_1(\mathbf {x})\) under model \(M_1\) is given by

$$\begin{aligned} m_1(\mathbf {x})= & {} \int _{0}^{\infty }\ldots \int _{0}^{\infty } L_1(\sigma _1,\ldots ,\sigma _k\vert \mathbf {x}) \pi _{1}^{N}(\sigma _1,\ldots ,\sigma _k)\hbox {d}\sigma _1 \ldots \hbox {d}\sigma _k \nonumber \\= & {} \left( \sqrt{2\pi }\right) ^{-n}2^{-k} \prod _{i=1}^k {\varGamma }\left[ n_i \over 2\right] \left\{ {S_i^2+n_i({\bar{x}}_i-\mu _0)^2\over 2}\right\} ^{-{n_i\over 2}}. \end{aligned}$$
(25)
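The \(\sigma _i\) integrals behind (25) reduce, for each group, to a Gamma-type integral, \(\int _0^{\infty } \sigma ^{-n_i-1} \exp \{-B_i/(2\sigma ^2)\}\hbox {d}\sigma = {1\over 2}{\varGamma }[n_i/2](B_i/2)^{-n_i/2}\) with \(B_i=S_i^2+n_i({\bar{x}}_i-\mu _0)^2\). A quick quadrature check of this identity (not from the paper, with hypothetical values of \(n_i\) and \(B_i\)):

```python
# Verify the Gamma-type sigma integral behind Eq. (25) numerically
# (n_i and B_i below are hypothetical values).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

n_i, B_i = 5, 3.7
closed = 0.5 * gamma(n_i / 2) * (B_i / 2) ** (-n_i / 2)
numeric, _ = quad(lambda s: s ** (-n_i - 1) * np.exp(-B_i / (2 * s**2)),
                  0, np.inf)
print(closed, numeric)   # should agree
```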

For model \(M_2\), the reference prior for \((\mu ,\sigma _1,\ldots ,\sigma _k)\) is

$$\begin{aligned} \pi _2^N (\mu ,\sigma _1,\ldots ,\sigma _k) \propto \prod _{i=1}^k \sigma _i^{-1}. \end{aligned}$$
(26)

The likelihood function under model \(M_2\) is

$$\begin{aligned} L_2(\mu ,\sigma _1,\ldots ,\sigma _k \vert \mathbf {x}) \!=\!\left( \sqrt{2\pi }\right) ^{-n} \left( \prod _{i=1}^{k}\sigma _i^{-n_i}\right) \exp \left\{ -\sum _{i=1}^k {1\over 2\sigma _i^2}\left[ S_i^2+n_i({\bar{x}}_i-\mu )^2\right] \right\} . \end{aligned}$$
(27)

Thus, from the likelihood (27) and the reference prior (26), the marginal density \(m_2(\mathbf {x})\) under model \(M_2\) is given by

$$\begin{aligned} m_2(\mathbf {x})= & {} \int _{-\infty }^{\infty }\int _{0}^{\infty }\ldots \int _{0}^{\infty } L_2(\mu ,\sigma _1,\ldots ,\sigma _k\vert \mathbf {x}) \pi _{2}^{N}(\mu ,\sigma _1,\ldots ,\sigma _k) \hbox {d}\sigma _1 \ldots \hbox {d}\sigma _k \hbox {d}\mu \nonumber \\= & {} \left( \sqrt{2\pi }\right) ^{-n}2^{-k} \prod _{i=1}^k {\varGamma }\left[ n_i \over 2\right] \int _{-\infty }^{\infty } \prod _{i=1}^k \left\{ {S_i^2+n_i({\bar{x}}_i-\mu )^2\over 2}\right\} ^{-{n_i\over 2}}\hbox {d}\mu . \end{aligned}$$
(28)

Therefore, \(B_{21}^N\) is given by

$$\begin{aligned} B_{21} ^{N}(\mathbf {x}) = \int _{-\infty }^{\infty } \prod _{i=1}^k \left\{ {S_i^2+n_i({\bar{x}}_i-\mu )^2\over S_i^2+n_i({\bar{x}}_i-\mu _0)^2}\right\} ^{-{n_i\over 2}}\hbox {d}\mu . \end{aligned}$$
(29)

Further, \(\pi ^N(\mu \vert \sigma _1,\ldots ,\sigma _k)=1\). Hence, Theorem 2 is proved. \(\square \)
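As a computational sketch (not part of the paper), \(B_{21}^N(\mathbf{x})\) in (29) can be evaluated by one-dimensional quadrature from the group summaries \((n_i, {\bar{x}}_i, S_i^2)\); the data below are hypothetical:

```python
# Quadrature evaluation of the Bayes factor B21^N of Eq. (29)
# from group summaries (hypothetical data).
import numpy as np
from scipy.integrate import quad

def bayes_factor_21(n_i, xbar, S2, mu0):
    """B21^N = int prod_i {(S_i^2 + n_i*(xbar_i - mu)^2) /
                           (S_i^2 + n_i*(xbar_i - mu0)^2)}^{-n_i/2} dmu."""
    n_i, xbar, S2 = map(np.asarray, (n_i, xbar, S2))
    denom = S2 + n_i * (xbar - mu0) ** 2

    def integrand(mu):
        num = S2 + n_i * (xbar - mu) ** 2
        return np.prod((num / denom) ** (-n_i / 2.0))

    val, _ = quad(integrand, -np.inf, np.inf)
    return val

# Three hypothetical groups whose sample means sit near mu0 = 0.
print(bayes_factor_21(n_i=[10, 12, 8],
                      xbar=[0.1, -0.05, 0.2],
                      S2=[9.5, 14.2, 6.8],
                      mu0=0.0))
```

As with any Bayes factor of \(M_2\) against \(M_1\), larger values of \(B_{21}^N\) indicate stronger support for \(M_2\).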

Cite this article

Kang, S.G., Lee, W.D. & Kim, Y. Objective Bayesian testing on the common mean of several normal distributions under divergence-based priors. Comput Stat 32, 71–91 (2017). https://doi.org/10.1007/s00180-016-0699-6
