Globally and symmetrically identified Bayesian multinomial probit model

Pan, Maolin; Gu, Minggao; Wu, Xianyi; Fan, Xiaodan

doi:10.1007/s11222-023-10232-4


Original Paper
Published: 11 April 2023

Volume 33, article number 68, (2023)

Statistics and Computing


Abstract

Bayesian multinomial probit models have been widely used to analyze discrete choice data. Existing methods have shortcomings in parameter identification or in the sensitivity of posterior inference to the labeling of choice objects. The main task of this study is to deal with these problems simultaneously. First, we propose a globally and symmetrically identified multinomial probit model with a positive semidefinite covariance matrix. However, it is difficult to design an efficient Bayesian algorithm to fit this model directly, because it is infeasible to sample a positive semidefinite matrix from a regular distribution. We therefore develop a projected model for the proposed model by a projection technique. The projected model is equivalent to the former one but is equipped with a positive definite covariance matrix. Finally, based on the projected model, we develop an efficient Bayesian algorithm to fit it using modern Markov chain Monte Carlo techniques. Through simulation studies and an analysis of clothes detergent purchase data, we demonstrate that our approach not only solves the identifiability problem but also shows robustness and satisfactory estimation accuracy, while the computation costs remain comparable.


References

Albert, J., Chib, S.: Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88, 669–679 (1993)
Anderson, S.P., de Palma, A., Thisse, J.F.: Discrete Choice Theory of Product Differentiation. MIT Press, Cambridge (1992)
Ben-Akiva, M., Lerman, S.R.: Discrete Choice Analysis: Theory and Application to Predict Travel Demand. MIT Press, Cambridge (1985)
Burgette, L., Nordheim, E.: The trace restriction: an alternative identification strategy for the Bayesian multinomial probit model. J. Bus. Econ. Stat. 30, 404–410 (2012)
Burgette, L., Puelz, D., Hahn, P.: A symmetric prior for multinomial probit models. Bayesian Anal. 16(3), 991–1008 (2021)
de Bekker-Grob, E.W., Ryan, M., Gerard, K.: Discrete choice experiments in health economics: a review of the literature. Health Econ. 21, 145–172 (2012)
Fong, D.K.H., Kim, S., Chen, Z., et al.: A Bayesian multinomial probit model for the analysis of panel choice data. Psychometrika 81, 161–183 (2016)
Hausman, J.A., Wise, D.A.: A conditional probit model for qualitative choice: discrete decisions recognizing interdependence and heterogeneous preferences. Econometrica 46, 403–426 (1978)
Hoff, P.D.: A First Course in Bayesian Statistical Methods. Springer Press, New York (2009)
Imai, K., van Dyk, D.: A Bayesian analysis of the multinomial probit model using marginal data augmentation. J. Econom. 124, 311–334 (2005a)
Imai, K., van Dyk, D.: MNP: R package for fitting the multinomial probit model. J. Stat. Softw. 14, 1–32 (2005b)
Keane, M.P.: A note on identification in the multinomial probit model. J. Bus. Econ. Stat. 10, 193–200 (1992)
Kruschke, J.K.: Bayesian estimation supersedes the t test. J. Exp. Psychol. Gen. 142(2), 573–603 (2013)
McCulloch, R., Rossi, P.: An exact likelihood analysis of the multinomial probit model. J. Econom. 64, 207–240 (1994)
McCulloch, R., Polson, N., Rossi, P.: A Bayesian analysis of the multinomial probit model with fully identified parameters. J. Econom. 99, 173–193 (2000)
Nobile, A.: A hybrid Markov chain for the Bayesian analysis of the multinomial probit model. Stat. Comput. 8, 229–242 (1998)
Quinn, K.M., Martin, A.D., Whitford, A.B.: Voter choice in multi-party democracies: a test of competing theories and models. Am. J. Political Sci. 43(4), 1231–1247 (1999)
Small, K.A., Rosen, H.S.: Applied welfare economics with discrete choice models. Econometrica 49, 105–130 (1981)
Tanner, M., Wong, W.: The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82, 528–540 (1987)
Yang, S., Allenby, G.M.: Modeling interdependent consumer preferences. J. Mark. Res. 40, 282–294 (2003)
Yu, P.L.H.: Bayesian analysis of order-statistics models for ranking data. Psychometrika 65, 281–299 (2000)


Acknowledgements

This research is partially supported by two grants from the National Natural Science Foundation of China (11501287 and 71571096) and three grants from the Research Grants Council General Research Fund of the Hong Kong Special Administrative Region, China (14303819, 14203915 and 14173817).

Author information

Authors and Affiliations

Department of Mathematics, Nanjing University, Nanjing, People’s Republic of China
Maolin Pan
Department of Statistics, The Chinese University of Hong Kong, Hong Kong, People’s Republic of China
Minggao Gu & Xiaodan Fan
School of Statistics, East China Normal University, Shanghai, People’s Republic of China
Xianyi Wu


Corresponding author

Correspondence to Xiaodan Fan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Prediction bias regarding BPH’s symmetric MNP model

To start with, we recall the MNP model defined in Eq. (7),

$$\begin{aligned} W=X\beta + \epsilon ,\quad \epsilon \sim \text {N}(0, \Sigma ), \end{aligned}\tag{20}$$

where $W$, $X$, $\beta $ and $\Sigma $ are defined as in Sect. 3.1. As discussed before, the parameters in model (20) are not identified unless some restrictions are imposed on them. Previous studies usually fix the first diagonal entry $\sigma _{11}$ of $\Sigma $ at some positive value. For ease of explanation, we first scale model (20) with the restriction $\sigma _{11}=1$ and take it as the baseline model. Then, for model (20) scaled by another identification method, denoted by

$$\begin{aligned} {\tilde{W}}=X{\tilde{\beta }} + {\tilde{\epsilon }},\quad {\tilde{\epsilon }}\sim \text {N}(0, {\tilde{\Sigma }} ), \end{aligned}\tag{21}$$

with ${\tilde{\sigma }}_{11}=\alpha ^{2}$ where $\alpha $ is a fixed positive value, we have ${\tilde{\Sigma }}=\alpha ^{2}\Sigma $ and ${\tilde{\beta }}=\alpha \beta $. Furthermore, we have ${\tilde{W}}=\alpha W$ in distribution, and for $ j=1,2,\ldots , p$,

$$\begin{aligned} P({\tilde{w}}_{j}>\max _{k\ne j}{\tilde{w}}_{k})=P(w_{j}>\max _{k\ne j}w_{k}). \end{aligned}$$
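The equality holds because multiplying every latent utility by the same positive constant does not change their ordering. As a short worked step, using only the relation ${\tilde{W}}=\alpha W$ in distribution stated above:

$$\begin{aligned} P({\tilde{w}}_{j}>\max _{k\ne j}{\tilde{w}}_{k}) = P(\alpha w_{j}>\alpha \max _{k\ne j}w_{k}) = P(w_{j}>\max _{k\ne j}w_{k}),\quad \alpha >0. \end{aligned}$$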

Next, we consider the partial trace restriction of BPH (Burgette, Puelz and Hahn 2021) on model (20). Suppose the $b$-th diagonal entry of the covariance matrix is picked out and the sum of the remaining diagonal entries is restricted to $p-1$, which yields the identified model

$$\begin{aligned} W^{(b)}=X\beta ^{(b)} + \epsilon ^{(b)},\quad \epsilon ^{(b)}\sim \text {N}(0, \Sigma ^{(b)} ), \end{aligned}\tag{22}$$

where $\text {tr}(\Sigma ^{(b)}_{-b})=p-1$. As in model (21), there exists $\alpha _{b}^{2}=\frac{p-1}{\text {tr}(\Sigma _{-b})}$ such that $\beta ^{(b)}=\alpha _{b}\beta $, $\Sigma ^{(b)}=\alpha _{b}^{2}\Sigma $ and $W^{(b)}=\alpha _{b}W$ in distribution. BPH’s symmetric MNP model averages the above models and obtains the posterior mean estimates of the coefficient vector and the covariance matrix as follows,

$$\begin{aligned} \beta ^{*}=&f_{1}\beta ^{(1)}+f_{2}\beta ^{(2)}+\cdots +f_{p}\beta ^{(p)}\\=&(f_{1}\alpha _{1}+f_{2}\alpha _{2}+\cdots +f_{p}\alpha _{p})\beta ,\\ \Sigma ^{*}=&f_{1}\Sigma ^{(1)}+f_{2}\Sigma ^{(2)}+\cdots +f_{p}\Sigma ^{(p)}\\=&(f_{1}\alpha _{1}^{2}+f_{2}\alpha _{2}^{2}+\cdots +f_{p}\alpha _{p}^{2})\Sigma , \end{aligned}$$

where $f_{j}$ denotes the posterior probability of the event $\{b=j\}$. Under BPH’s uniform prior on the parameter $b\in \{1,2,\ldots , p\}$, we have $f_{j}>0$ for all $j$. By Jensen’s inequality,

$$\begin{aligned} (f_{1}\alpha _{1}+f_{2}\alpha _{2}+\cdots +f_{p}\alpha _{p})^{2} \le f_{1}\alpha _{1}^{2}+f_{2}\alpha _{2}^{2}+\cdots +f_{p}\alpha _{p}^{2}, \end{aligned}$$

where equality holds if and only if $\alpha _{1}=\alpha _{2}=\cdots =\alpha _{p}$. For simplicity, denote

$$\begin{aligned} \alpha _{\beta }=&f_{1}\alpha _{1}+f_{2}\alpha _{2}+\cdots +f_{p}\alpha _{p},\\ \alpha _{\sigma }=&\sqrt{f_{1}\alpha _{1}^{2}+f_{2}\alpha _{2}^{2}+\cdots +f_{p}\alpha _{p}^{2}}. \end{aligned}$$

For a given covariate matrix $X$, denote $X\beta $ by $\mu $ with elements $\mu _{i}, i=1,2,\ldots ,p$. Then the latent random vector $W^{*}$ defined by BPH’s symmetric MNP model follows the normal distribution

$$\begin{aligned} W^{*}\sim N(\alpha _{\beta }\mu , \alpha _{\sigma }^{2}\Sigma ). \end{aligned}$$

Taking $\alpha =\alpha _{\sigma }$ in the MNP model defined in Eq. (21), the latent random vector satisfies

$$\begin{aligned} {\tilde{W}}\sim N(\alpha _{\sigma }\mu , \alpha _{\sigma }^{2}\Sigma ). \end{aligned}$$

Without loss of generality, we assume $\mu _{1}\ge \mu _{k}$ for $k=2,\ldots ,p$, with at least one inequality strict. In addition, suppose that not all diagonal elements of $\Sigma $ are equal, which implies that not all $\alpha _{i}$, $i=1,2,\ldots ,p$, are equal. Then we have $\alpha _{\beta }<\alpha _{\sigma }$ and further

$$\begin{aligned}&P(w_{1}^{*}>w_{k}^{*}, k\in P_{1})\\ =&P(w_{1}^{*}-\alpha _{\beta }\mu _{1}>w_{k}^{*}-\alpha _{\beta }\mu _{1}, k\in P_{1})\\ =&P(w_{1}^{*}-\alpha _{\beta }\mu _{1}>w_{k}^{*}-\alpha _{\beta }\mu _{k}+\alpha _{\beta }(\mu _{k}-\mu _{1}), k\in P_{1})\\ =&P({\tilde{w}}_{1}-\alpha _{\sigma }\mu _{1}>{\tilde{w}}_{k}-\alpha _{\sigma }\mu _{k}+\alpha _{\beta }(\mu _{k}-\mu _{1}), k\in P_{1})\\ <&P({\tilde{w}}_{1}-\alpha _{\sigma }\mu _{1}>{\tilde{w}}_{k}-\alpha _{\sigma }\mu _{k}+\alpha _{\sigma }(\mu _{k}-\mu _{1}), k\in P_{1})\\ =&P({\tilde{w}}_{1}>{\tilde{w}}_{k}, k\in P_{1})\\ =&P(w_{1}>w_{k}, k\in P_{1}),\\ \end{aligned}$$

where $P_{1}=\{2,3,\ldots ,p\}$. The third equality holds because $W^{*}-\alpha _{\beta }\mu ={\tilde{W}}-\alpha _{\sigma }\mu $ in distribution. The last equality holds because ${\tilde{W}}=\alpha _{\sigma }W$ in distribution. The above inequality says that the probability of choosing the object with label 1 under BPH’s symmetric MNP model is smaller than that under the baseline model. In other words, BPH’s symmetric MNP model distorts the true choice probabilities unless all the diagonal entries of the true covariance matrix are equal.
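The argument of this appendix can be checked numerically. The following is a minimal sketch under simplifying assumptions that are not taken from the paper: $\Sigma $ is diagonal (independent errors, so only the standard library is needed), the weights are set to $f_{j}=1/p$, and the values of $\mu $ and $\Sigma $ are illustrative. The function names `trace_scales` and `prob_label1` are hypothetical. The same standard normal draws are reused for both models, so the comparison is not blurred by independent Monte Carlo noise.

```python
import math
import random


def trace_scales(sigma_diag):
    """alpha_b = sqrt((p - 1) / tr(Sigma_{-b})) for each candidate b."""
    p = len(sigma_diag)
    total = sum(sigma_diag)
    return [math.sqrt((p - 1) / (total - s_bb)) for s_bb in sigma_diag]


def prob_label1(mean, sd, noise):
    """Monte Carlo estimate of P(w_1 > w_k for all k != 1), independent normals."""
    hits = 0
    for eps in noise:
        w = [m + s * e for m, s, e in zip(mean, sd, eps)]
        if w[0] > max(w[1:]):
            hits += 1
    return hits / len(noise)


# Illustrative values (assumptions, not taken from the paper).
sigma_diag = [1.0, 4.0, 0.25]  # unequal diagonal of Sigma, with sigma_11 = 1
mu = [1.0, 0.0, 0.0]           # mu_1 >= mu_k, with strict inequalities
p = len(mu)
f = [1.0 / p] * p              # assumed weights f_j

alphas = trace_scales(sigma_diag)
alpha_beta = sum(fj * a for fj, a in zip(f, alphas))
alpha_sigma = math.sqrt(sum(fj * a * a for fj, a in zip(f, alphas)))

# Common random numbers: identical standard normal draws for both models.
rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(20000)]
sd = [math.sqrt(s) for s in sigma_diag]

# Baseline model: W ~ N(mu, Sigma).
prob_baseline = prob_label1(mu, sd, noise)
# Averaged model: W* ~ N(alpha_beta * mu, alpha_sigma^2 * Sigma).
prob_bph = prob_label1([alpha_beta * m for m in mu],
                       [alpha_sigma * s for s in sd], noise)

print(f"alpha_beta={alpha_beta:.4f}, alpha_sigma={alpha_sigma:.4f}")
print(f"P(label 1): baseline={prob_baseline:.4f}, averaged={prob_bph:.4f}")
```

With unequal diagonal entries the sketch gives $\alpha _{\beta }<\alpha _{\sigma }$ and a smaller estimated probability for label 1 under the averaged model; setting all entries of `sigma_diag` equal makes the two scales coincide.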

Appendix 2: Prior for covariance matrix with trace augmented restriction

Suppose the $p\times p$ matrix ${\tilde{A}}\sim \text {InvWishart}(v,\Psi )$, where $\Psi $ is a positive definite $p\times p$ matrix and $v\ (\ge p)$ is the degrees of freedom. Transform ${\tilde{A}}$ to $(\alpha ^{2}, A)$ as follows,

$$\begin{aligned} \alpha ^{2}=\frac{\text {tr}({\tilde{A}}(I+J))}{p-1}, \quad A=\frac{{\tilde{A}}}{\alpha ^{2}}, \end{aligned}$$

where $I$ is the $p\times p$ identity matrix and $J$ is the $p\times p$ matrix with all entries equal to 1. Let $1\{\cdot \}$ be the indicator function; then the joint distribution of $\alpha ^{2}$ and $A$ is

$$\begin{aligned} p(\alpha ^2, A)\propto&\mid A\mid ^{-\frac{v+p+1}{2}} \exp \{-\frac{\text {tr}(\Psi A^{-1})}{2\alpha ^2}\} (\alpha ^2) ^{-\frac{vp+2}{2}}\\ {}&\cdot 1\{\text {tr}(A(I+J))=p-1\}, \end{aligned}$$

and the marginal distribution of A is

$$\begin{aligned} p(A)\propto&\mid A\mid ^{-\frac{v+p+1}{2}}\cdot [\text {tr}(\Psi A^{-1})]^{-\frac{vp}{2}}\\ {}&\cdot 1\{\text {tr}(A(I+J))=p-1\}. \end{aligned}$$

Proof

Set ${\tilde{A}}_{ex} ={\tilde{A}}(I+J)$, and make the following transformations

$$\begin{aligned} \alpha _{ex}^{2}=\frac{\text {tr}({\tilde{A}}_{ex})}{p-1}, \quad A_{ex}=\frac{{\tilde{A}}_{ex}}{\alpha _{ex}^{2}}. \end{aligned}\tag{23}$$

By the distribution assumption of ${\tilde{A}}$, we know

$$\begin{aligned} p({\tilde{A}})\propto \mid {\tilde{A}}\mid ^{-\frac{v+p+1}{2}} \exp \{-\frac{\text {tr}(\Psi {\tilde{A}}^{-1})}{2}\}, \end{aligned}$$

which induces

$$\begin{aligned} p({\tilde{A}}_{ex})&\propto \mid {\tilde{A}}_{ex}\mid ^{-\frac{v+p+1}{2}} \exp \{-\frac{\text {tr}(\Psi _{ex} {\tilde{A}}_{ex}^{-1})}{2}\}\cdot Jacobian_{1}\\&\propto \mid {\tilde{A}}_{ex}\mid ^{-\frac{v+p+1}{2}} \exp \{-\frac{\text {tr}(\Psi _{ex} {\tilde{A}}_{ex}^{-1})}{2}\}, \end{aligned}$$

where $\Psi _{ex}=\Psi (I+J)$. The last proportionality holds because $Jacobian_{1}$ is constant with respect to ${\tilde{A}}_{ex}$. Combining Eq. (7) of Burgette and Nordheim (2012) and Eq. (23), we have

$$\begin{aligned} p(\alpha _{ex}^{2}, A_{ex})\propto&\mid A_{ex}\mid ^{-\frac{v+p+1}{2}} \exp \{-\frac{\text {tr}(\Psi _{ex} A_{ex}^{-1})}{2\alpha _{ex}^2}\} \\ {}&\cdot (\alpha _{ex}^2)^{-\frac{vp+2}{2}}\cdot 1\{\text {tr}(A_{ex})=p-1\}. \end{aligned}\tag{24}$$

Since $\alpha ^{2}=\alpha _{ex}^{2}$, $A=A_{ex}(I+J)^{-1}$ and the Jacobian of this transformation is constant, from Eq. (24) we have

$$\begin{aligned} p(\alpha ^{2}, A)\propto&\mid A\mid ^{-\frac{v+p+1}{2}} \exp \{-\frac{\text {tr}(\Psi A^{-1})}{2\alpha ^2}\} (\alpha ^2)^{-\frac{vp+2}{2}}\\ {}&\cdot 1\{\text {tr}(A(I+J))=p-1\}. \end{aligned}\tag{25}$$

By integrating (25) over $\alpha ^{2}$, we have

$$\begin{aligned} p(A)\propto&\mid A\mid ^{-\frac{v+p+1}{2}}\cdot [\text {tr}(\Psi A^{-1})]^{-\frac{vp}{2}} \\ {}&\cdot 1\{\text {tr}(A(I+J))=p-1\}. \end{aligned}$$
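The integration over $\alpha ^{2}$ is an integral of inverse-gamma type. As a sketch of the step, for generic constants $a>0$ and $c>0$ (symbols introduced here for illustration only),

$$\begin{aligned} \int _{0}^{\infty }x^{-(a+1)}\exp \{-\frac{c}{x}\}\,\textrm{d}x=\Gamma (a)\,c^{-a}. \end{aligned}$$

Applying this with $x=\alpha ^{2}$, $c=\frac{\text {tr}(\Psi A^{-1})}{2}$ and $a=\frac{vp}{2}$, so that $a+1=\frac{vp+2}{2}$ matches the exponent in (25), gives a factor proportional to $[\text {tr}(\Psi A^{-1})]^{-\frac{vp}{2}}$, which is the second factor of the marginal density of $A$.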

$\square $
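To make the transformation from ${\tilde{A}}$ to $(\alpha ^{2}, A)$ concrete, here is a minimal sketch using only the standard library. It relies on the identity $\text {tr}(A(I+J))=\text {tr}(A)+\sum _{i,j}a_{ij}$, which holds because $J$ has all entries equal to 1. As an assumption made for illustration, ${\tilde{A}}$ is an arbitrary symmetric positive definite matrix built as $BB^{T}+I$ rather than an actual inverse-Wishart draw; the helper names `trace_aug` and `split_scale` are hypothetical.

```python
import random


def trace_aug(a):
    """Return tr(A (I + J)) = tr(A) + (sum of all entries of A)."""
    p = len(a)
    return sum(a[i][i] for i in range(p)) + sum(sum(row) for row in a)


def split_scale(a_tilde):
    """Map A_tilde to (alpha^2, A) so that tr(A (I + J)) = p - 1."""
    p = len(a_tilde)
    alpha2 = trace_aug(a_tilde) / (p - 1)
    a = [[x / alpha2 for x in row] for row in a_tilde]
    return alpha2, a


# Stand-in for one draw of A_tilde (assumption): B B^T + I is symmetric
# positive definite, but it is NOT sampled from an inverse-Wishart law.
rng = random.Random(1)
p = 4
b = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(p)]
a_tilde = [[sum(b[i][k] * b[j][k] for k in range(p)) + (1.0 if i == j else 0.0)
            for j in range(p)] for i in range(p)]

alpha2, a = split_scale(a_tilde)
print("alpha^2 =", alpha2)
print("tr(A(I+J)) =", trace_aug(a), "target =", p - 1)
```

The pair $(\alpha ^{2}, A)$ carries the same information as ${\tilde{A}}$, since ${\tilde{A}}=\alpha ^{2}A$; the indicator in the densities above simply records that $A$ always satisfies the trace restriction.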

Appendix 3: Proof of equal posterior distributions

Posterior distributions of the same parameters under different projected GSI models are equal to each other.

Proof

Without loss of generality, we only consider the posterior distributions of $\beta _{-1}$ and $\Sigma _{-1}$. For all $b\ge 2$,

$$\begin{aligned}&p_{b}(\beta _{-b}, \Sigma _{-b}\mid ProJ_{b},W)\\&\quad = p(W_{-b}\mid \beta _{-b},\Sigma _{-b})\cdot p(\beta _{-b})\cdot p(\Sigma _{-b})\\&\quad =\exp \{-\frac{1}{2}\sum _{i=1}^{n}(W_{i,-b}-X_{i,-b}\beta _{-b})^{T}\Sigma _{-b}^{-1}(W_{i,-b}-X_{i,-b}\beta _{-b})\}\\&\qquad \cdot (2\pi )^{-\frac{n(p-1)}{2}}\mid \Sigma _{-b}\mid ^{-\frac{n}{2}}\cdot p(\beta _{-b})\cdot p(\Sigma _{-b}), \end{aligned}$$

and

$$\begin{aligned}&p_{1}(\beta _{-1}, \Sigma _{-1}\mid ProJ_{b},W)\\&\quad = p_{b}(\beta _{-1}^{b},D_{b}\Sigma _{-1}D_{b}^{T}\mid ProJ_{b},W)\cdot \mid J_{\beta _{-b}\rightarrow \beta _{-1}}\mid \cdot \mid J_{\Sigma _{-b}\rightarrow \Sigma _{-1}}\mid \\&\quad =\exp \{-\frac{1}{2}\sum _{i=1}^{n}(W_{i,-b}-X_{i,-b}\beta _{-1}^{b})^{T} (D_{b}\Sigma _{-1}D_{b}^{T})^{-1}(W_{i,-b}-X_{i,-b}\beta _{-1}^{b})\}\\&\qquad \cdot (2\pi )^{-\frac{n(p-1)}{2}}\mid D_{b}\Sigma _{-1}D_{b}^{T}\mid ^{-\frac{n}{2}}\cdot p(\beta _{-1}^{b})\cdot p(D_{b}\Sigma _{-1}D_{b}^{T})\cdot 1\cdot 1\\&\quad =\exp \{-\frac{1}{2}\sum _{i=1}^{n}(D_{b}^{-1}W_{i,-b}-D_{b}^{-1}X_{i,-b}\beta _{-1}^{b})^{T} \Sigma _{-1}^{-1}(D_{b}^{-1}W_{i,-b}-D_{b}^{-1}X_{i,-b}\beta _{-1}^{b})\}\\&\qquad \cdot (2\pi )^{-\frac{n(p-1)}{2}}\mid \Sigma _{-1}\mid ^{-\frac{n}{2}}\cdot p(\beta _{-1})\cdot p(\Sigma _{-1})\\&\quad =\exp \{-\frac{1}{2}\sum _{i=1}^{n}(W_{i,-1}-X_{i,-1}\beta _{-1})^{T} \Sigma _{-1}^{-1}(W_{i,-1}-X_{i,-1}\beta _{-1})\}\\&\qquad \cdot (2\pi )^{-\frac{n(p-1)}{2}}\mid \Sigma _{-1}\mid ^{-\frac{n}{2}}\cdot p(\beta _{-1})\cdot p(\Sigma _{-1})\\&\quad =p_{1}(\beta _{-1}, \Sigma _{-1}\mid ProJ_{1}, W), \end{aligned}$$

where

$$\begin{aligned} \beta _{-1}^{b}= \left( \begin{array}{c} D_{b}\delta _{1,-1}\\ \vdots \\ D_{b}\delta _{q_{1},-1}\\ \gamma \end{array}\right) . \end{aligned}$$

$\square $

Appendix 4: Derivation of the posterior of b

Assume the prior of $b$ is uniform on the set $P=\{1,2,\ldots , p\}$; then the posterior of $b$ given $Y$ is also uniform on $P$.

Proof

$$\begin{aligned}&p(W_{-1}\mid b=1)\\&\quad =\int f(W_{-1}\mid b=1,\beta _{-1},\Sigma _{-1}) p(\beta _{-1}) p(\Sigma _{-1})\textrm{d}\beta _{-1}\textrm{d}\Sigma _{-1} \end{aligned}$$

where

$$\begin{aligned}&f(W_{-1}\mid b=1,\beta _{-1},\Sigma _{-1})\\&=\exp \{-\frac{1}{2}(W_{-1}-X_{-1}\beta _{-1})^{T} \Sigma _{-1}^{-1}(W_{-1}-X_{-1}\beta _{-1})\}\\&\cdot (2\pi )^{-\frac{p-1}{2}}\mid \Sigma _{-1}\mid ^{-\frac{1}{2}}\\&p(\beta _{-1})=(2\pi )^{-\frac{q_{1}(p-1)+q_{2}}{2}}\mid B\mid ^{-\frac{1}{2}}\exp \{-\frac{1}{2}\beta _{-1}^{T}B^{-1}\beta _{-1}\}. \end{aligned}$$

Since $W_{-p}=D_{p}W_{-1}$, we have

$$\begin{aligned}&p(W_{-p}\mid b=1)\\ =&\int f(D_{p}^{-1}W_{-p}\mid b=1,\beta _{-1},\Sigma _{-1}) p(\beta _{-1}) p(\Sigma _{-1})\textrm{d}\beta _{-1}\textrm{d}\Sigma _{-1}. \end{aligned}$$

Further,

$$\begin{aligned}&f(D_{p}^{-1}W_{-p}\mid b=1,\beta _{-1},\Sigma _{-1}) \\&\quad =\exp \{-\frac{1}{2}(W_{-p}-D_{p}X_{-1}\beta _{-1})^{T}(D_{p}\Sigma _{-1}D_{p}^{T})^{-1}\\ {}&\cdot (W_{-p}-D_{p}X_{-1}\beta _{-1})\}\cdot (2\pi )^{-\frac{p-1}{2}}\mid \Sigma _{-1}\mid ^{-\frac{1}{2}}\\&\quad =\exp \{-\frac{1}{2}(W_{-p}-X_{-p}\beta _{-p})^{T}\Sigma _{-p}^{-1}(W_{-p}-X_{-p}\beta _{-p})\}\\&\quad \cdot (2\pi )^{-\frac{p-1}{2}}\mid \Sigma _{-p}\mid ^{-\frac{1}{2}}\\&\quad =f(W_{-p}\mid b=p,\beta _{-p},\Sigma _{-p}), \end{aligned}$$

where $\Sigma _{-p}=D_{p}\Sigma _{-1}D_{p}^{T}$ and $\delta _{k,-p}=D_{p}\delta _{k,-1}$, $k=1,\ldots , q_{1}$; here $\delta _{k,-p}$ and $\delta _{k,-1}$ are components of $\beta _{-p}$ and $\beta _{-1}$, respectively. Since the absolute values of the Jacobians $J_{\Sigma _{-1}\rightarrow \Sigma _{-p}}$ and $J_{\beta _{-1}\rightarrow \beta _{-p}}$ are both equal to 1, we have

$$\begin{aligned}&p(W_{-p}\mid b=1) \\&\quad =\int f(W_{-p}\mid b=p,\beta _{-p},\Sigma _{-p}) p(\beta _{-p}) p(\Sigma _{-p})\textrm{d}\beta _{-p}\textrm{d}\Sigma _{-p}\\&\quad = p(W_{-p}\mid b=p). \end{aligned}$$

By similar deduction, we get equalities as follows

$$\begin{aligned} p(W_{-p}\mid b=1)= p(W_{-p}\mid b=2)= \cdots = p(W_{-p}\mid b=p). \end{aligned}$$

Because Y is uniquely determined by $W_{-b}$,

$$\begin{aligned} P(Y\mid b=1)= P(Y\mid b=2)= \cdots =P(Y\mid b=p). \end{aligned}$$

According to Bayes’ rule,

$$\begin{aligned} P(b=j\mid Y)= \frac{P(Y\mid b=j)P(b=j)}{P(Y)}=\frac{1}{p}, \quad j\in P, \end{aligned}$$

since the prior of b is uniform. That is, the posterior of b given Y is uniform on P. $\square $
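The final step of the proof can be illustrated with exact arithmetic: when the likelihood $P(Y\mid b=j)$ is the same for every $j$, Bayes’ rule returns the prior. In the sketch below the common likelihood value and the helper name `posterior_of_b` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def posterior_of_b(likelihoods, prior):
    """Bayes' rule on a finite set: posterior_j is proportional to
    likelihood_j * prior_j, normalised to sum to one."""
    joint = [lik * pri for lik, pri in zip(likelihoods, prior)]
    total = sum(joint)
    return [x / total for x in joint]


p = 4
prior = [Fraction(1, p)] * p             # uniform prior on {1, ..., p}
common_likelihood = Fraction(3, 1000)    # hypothetical value of P(Y | b = j)
posterior = posterior_of_b([common_likelihood] * p, prior)
print(posterior)
```

The common likelihood cancels in the normalisation, so its actual value plays no role in the result.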

Appendix 5: Bayesian estimation of $\theta $

Suppose $Y_{1}, Y_{2},\ldots ,Y_{n}$ are i.i.d. $\text {N}(\theta , \frac{1}{\phi })$. Then their joint density is given by

$$\begin{aligned}&p(y_{1},\ldots ,y_{n}\mid \theta , \phi )\\&\quad =(2\pi )^{-\frac{n}{2}}\phi ^{\frac{n}{2}}\exp \{-\frac{\phi }{2}\sum _{i=1}^{n}(y_{i}-\theta )^{2}\}. \end{aligned}$$

A conjugate prior distribution for $(\theta , \phi )$ is the normal-gamma distribution. In detail,

$$\begin{aligned} \theta \mid \phi \sim N(\mu _{0}, \frac{1}{\kappa _{0}\phi }),\quad \phi \sim Ga(\frac{\nu _{0}}{2}, \frac{SS_{0}}{2}). \end{aligned}$$

We get the joint posterior distribution for $(\theta , \phi )$ as follows

$$\begin{aligned} \theta \mid \phi , y\sim N(\mu _{n}, \frac{1}{\kappa _{n}\phi }),\quad \phi \mid y\sim Ga(\frac{\nu _{n}}{2}, \frac{SS_{n}}{2}) \end{aligned}$$

where

$$\begin{aligned} \kappa _{n}&=\kappa _{0}+n,\quad \nu _{n} =\nu _{0}+n\\ \mu _{n}&=\frac{\kappa _{0}\mu _{0}+n{\bar{y}}}{\kappa _{n}}\\ SS_{n}&=SS_{0}+SS+\frac{n\kappa _{0}}{\kappa _{n}}({\bar{y}}-\mu _{0})^{2} \end{aligned}$$

and ${\bar{y}} =\frac{1}{n}\sum _{i=1}^{n}y_{i},~ SS = \sum _{i=1}^{n}(y_{i}-{\bar{y}})^{2}.$ As pointed out by Hoff (2009), the marginal posterior distribution of $\theta $ is a shifted and scaled t distribution, i.e.

$$\begin{aligned} \frac{\theta -\mu _{n}}{\sqrt{\frac{SS_{n}}{\kappa _{n}\nu _{n}}}}\sim t(\nu _{n}). \end{aligned}$$

In the simulation studies in Sects. 4.1 and 4.2, $Y_{1},\ldots ,Y_{50}$ represent the paired differences of average total variations (ATV) between two competing models. In all cases, we set $\mu _{0}=0$, which means that we have no prior preference for either of the paired models, and $\kappa _{0}=1$, $\nu _{0}=1$. In addition, in view of $\textrm{E}\phi =\frac{\nu _{0}}{SS_{0}}$, we set $SS_{0}\approx \frac{\nu _{0}}{{\hat{\phi }}}$, where ${\hat{\phi }}$ is an estimate of $\phi $; in our studies we take the reciprocal of the corresponding sample variance.
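The update formulas of this appendix can be sketched in a few lines using only the standard library. The data below are toy values rather than the ATV differences from the simulation studies, and the function names `normal_gamma_update` and `sample_theta` are hypothetical. The prior settings follow the text: $\mu _{0}=0$, $\kappa _{0}=1$, $\nu _{0}=1$ and $SS_{0}=\nu _{0}$ times the sample variance.

```python
import math
import random


def normal_gamma_update(y, mu0, kappa0, nu0, ss0):
    """Posterior hyperparameters (mu_n, kappa_n, nu_n, SS_n)."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * ybar) / kappa_n
    ss_n = ss0 + ss + n * kappa0 / kappa_n * (ybar - mu0) ** 2
    return mu_n, kappa_n, nu_n, ss_n


def sample_theta(mu_n, kappa_n, nu_n, ss_n, size, rng):
    """Draw phi | y ~ Ga(nu_n/2, rate SS_n/2), then theta | phi, y."""
    draws = []
    for _ in range(size):
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        phi = rng.gammavariate(nu_n / 2.0, 2.0 / ss_n)
        draws.append(rng.gauss(mu_n, 1.0 / math.sqrt(kappa_n * phi)))
    return draws


y = [1.0, 2.0, 3.0]  # toy paired differences (illustrative only)
n = len(y)
ybar = sum(y) / n
sample_var = sum((yi - ybar) ** 2 for yi in y) / (n - 1)

nu0 = 1.0
ss0 = nu0 * sample_var  # SS_0 = nu_0 / phi_hat with phi_hat = 1 / sample variance
mu_n, kappa_n, nu_n, ss_n = normal_gamma_update(y, 0.0, 1.0, nu0, ss0)

# Scale of the marginal t distribution of theta with nu_n degrees of freedom.
t_scale = math.sqrt(ss_n / (kappa_n * nu_n))

draws = sample_theta(mu_n, kappa_n, nu_n, ss_n, 20000, random.Random(0))
prob_positive = sum(d > 0 for d in draws) / len(draws)
print(mu_n, kappa_n, nu_n, ss_n, t_scale, prob_positive)
```

Posterior draws of $\theta $ obtained this way can be used to approximate a posterior density or the posterior probability that $\theta >0$, that is, that the first model of a pair has the larger ATV.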

Figure 8 shows the posterior density of $\theta $ in Sect. 4.1: the dashed line corresponds to the ATV of the BPH model minus that of the ProJ1 model, and the solid line to the ATV of the AveGSI model minus that of the ProJ1 model. Figure 9 shows the posterior density of $\theta $ in Sect. 4.2: the dashed line corresponds to the ATV of the BN1 model minus that of the AveGSI model, and the solid line to the ATV of the ProJ1 model minus that of the AveGSI model. In Sect. 4.1, the posterior density plots of $\theta $ resulting from comparisons between BPH and the GSI models resemble the dashed line in Fig. 8, and the others resemble the solid line. In Sect. 4.2, the posterior density plots of $\theta $ resulting from comparisons between the BN models and AveGSI resemble the dashed line in Fig. 9, and the others resemble the solid line.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Pan, M., Gu, M., Wu, X. et al. Globally and symmetrically identified Bayesian multinomial probit model. Stat Comput 33, 68 (2023). https://doi.org/10.1007/s11222-023-10232-4


Received: 29 March 2022
Accepted: 03 March 2023
Published: 11 April 2023
DOI: https://doi.org/10.1007/s11222-023-10232-4
