Bayesian analysis for matrix-variate logistic regression with/without response misclassification

Fang, Junhan; Yi, Grace Y.

doi:10.1007/s11222-023-10286-4

Original Paper
Published: 23 August 2023

Volume 33, article number 121, (2023)

Statistics and Computing

Junhan Fang^1,2 &
Grace Y. Yi^1,2

Abstract

Matrix-variate logistic regression is useful for characterizing the relationship between a binary response and matrix-variates, which arise commonly in medical imaging research. However, inference based on such a model is impaired by the presence of response misclassification and spurious covariates. It is imperative to account for misclassification effects and select active covariates when employing matrix-variate logistic regression to handle such data. In this paper, we develop Bayesian inferential methods with the horseshoe prior. We numerically examine the biases induced by the naive analysis, which ignores misclassification of responses. The performance of the proposed methods is justified empirically, and their usage is illustrated by an application to the Lee Silverman Voice Treatment (LSVT) Companion data.

References

Bhattacharya, A., Pati, D., Pillai, N.S., Dunson, D.B.: Dirichlet–Laplace priors for optimal shrinkage. J. Am. Stat. Assoc. 110, 1479–1490 (2015)
Biane, P., Pitman, J., Yor, M.: Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions. Bull. Am. Math. Soc. 38, 435–465 (2001)
Carvalho, C.M., Polson, N.G., Scott, J.G.: The horseshoe estimator for sparse signals. Biometrika 97, 465–480 (2010)
Choi, H.M., Hobert, J.P.: The Polya-Gamma Gibbs sampler for Bayesian logistic regression is uniformly ergodic. Electron. J. Stat. 7, 2054–2064 (2013)
Dellaportas, P., Stephens, D.A.: Bayesian analysis of errors-in-variables regression models. Biometrics 51, 1085–1095 (1993)
Fang, J., Yi, G.Y.: Matrix-variate logistic regression with measurement error. Biometrika 108, 83–97 (2020)
Gamerman, D.: Sampling from the posterior distribution in generalized linear mixed models. Stat. Comput. 7, 57–68 (1997)
George, E.I., McCulloch, R.E.: Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88, 881–889 (1993)
Gerlach, R., Stamey, J.: Bayesian model selection for logistic regression with misclassified outcomes. Stat. Model. 7, 255–273 (2003)
Gramacy, R.B., Polson, N.G.: Simulation-based regularized logistic regression. Bayesian Anal. 7, 567–590 (2012)
Guhaniyogi, R., Qamar, S., Dunson, D.B.: Bayesian tensor regression. J. Mach. Learn. Res. 18, 2733–2763 (2017)
Gustafson, P.: Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments. CRC Press, Boca Raton (2003)
Holmes, C.C., Held, L.: Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Anal. 1, 145–168 (2006)
Hung, H., Wang, C.-C.: Matrix variate logistic regression model with application to EEG data. Biostatistics 14, 189–202 (2013)
Ishwaran, H., Rao, J.S.: Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Stat. 33, 730–773 (2005)
McInturff, P., Johnson, W.O., Cowling, D., Gardner, I.A.: Modelling risk when binary outcomes are subject to error. Stat. Med. 23, 1095–1109 (2004)
Paulino, C.D., Soares, P., Neuhaus, J.: Binomial regression with misclassification. Biometrics 59, 670–675 (2003)
Polson, N.G., Scott, J.G., Windle, J.: Bayesian inference for logistic models using Polya-Gamma latent variables. J. Am. Stat. Assoc. 108, 1339–1349 (2013)
Polson, N.G., Scott, J.G., Windle, J.: The Bayesian bridge. J. R. Stat. Soc. B 76, 713–733 (2014)
Rekaya, R., Weigel, K.A., Gianola, D.: Threshold model for misclassified binary responses with applications to animal breeding. Biometrics 57, 1123–1129 (2001)
Richardson, S., Gilks, W.R.: A Bayesian approach to measurement error problems in epidemiology using conditional independence models. Am. J. Epidemiol. 138, 430–442 (1993)
Rossi, P.E., Allenby, G.M., McCulloch, R.: Bayesian Statistics and Marketing. Wiley, New York (2005)
Tanner, M.A., Wong, W.H.: The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82, 528–540 (1987)
Tibshirani, R.: Regression Shrinkage and Selection via the lasso. J. R. Stat. Soc.: Ser. B (Methodol.) 58, 267–288 (1996)
Tsanas, A., Little, M.A., Fox, C., Ramig, L.O.: Objective automatic assessment of rehabilitative speech treatment in Parkinson’s disease. IEEE Trans. Neural Syst. Rehabil. Eng. 22, 181–190 (2013)
Wei, R., Ghosal, S.: Contraction properties of shrinkage priors in logistic regression. J. Stat. Plan. Inference 207, 215–229 (2020)
Zeger, S.L., Karim, M.: Generalized linear models with random effects: a Gibbs sampling approach. J. Am. Stat. Assoc. 86, 79–86 (1991)
Zellner, A., Rossi, P.E.: Bayesian analysis of dichotomous quantal response models. J. Econom. 25, 365–393 (1984)
Zhou, H., Li, L., Zhu, H.: Tensor regression with applications in neuroimaging data analysis. J. Am. Stat. Assoc. 108, 540–552 (2013)

Funding

This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). Yi is Canada Research Chair in Data Science (Tier 1). Her research was undertaken, in part, thanks to funding from the Canada Research Chairs program.

Author information

Authors and Affiliations

Department of Statistical and Actuarial Sciences, Department of Computer Science, University of Western Ontario, London, ON, N6A 5B7, Canada
Junhan Fang & Grace Y. Yi
Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Junhan Fang & Grace Y. Yi

Contributions

JF and GYY wrote the manuscript text. JF and GYY reviewed the manuscript.

Corresponding author

Correspondence to Grace Y. Yi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix

Full conditional distribution of hyperparameters

As stated in (3.4), the prior distributions for the hyperparameters $\lambda _{\alpha _i}$, $\lambda _{\beta _i}$, $\lambda _{\gamma _i}$ and $a$ are set as half-Cauchy distributions. Here we express the conditional distribution of $\lambda _{\alpha _i}$ only; the conditional distributions of the other hyperparameters can be derived in the same manner:

$$\begin{aligned} \begin{aligned} \pi (\lambda _{\alpha _i}| \alpha , \beta ,\gamma ,a)&=\pi (\lambda _{\alpha _i}| \alpha _i,a) \\&\propto \pi (\lambda _{\alpha _i}) \cdot \pi (\alpha _i| \lambda _{\alpha _i},a ) \\&\propto \frac{2}{\pi } \frac{1}{1+\lambda ^2_{\alpha _i}} \cdot \exp \left( -\frac{\alpha ^2_i}{2\lambda ^2_{\alpha _i}a^2}\right) . \end{aligned} \end{aligned}$$

The slice-sampling algorithm (Polson et al. 2014) is used to generate $\lambda _{\alpha _i}$ from $\pi (\lambda _{\alpha _i}| \alpha , \beta ,\gamma ,a)$, as in Sect. 3.3.
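As an illustration of how such an update might look, the following is a minimal sketch of a two-step slice sampler for a half-Cauchy local scale. It is not taken from the paper: the function name `slice_update_lambda` is hypothetical, and the sketch assumes the target density of $\lambda$ is proportional to $(1+\lambda ^2)^{-1}\lambda ^{-1}\exp \{-\alpha _i^2/(2\lambda ^2a^2)\}$, that is, it keeps the $1/\lambda$ normalising factor of the $N(0,\lambda ^2a^2)$ density. Under that assumption, $\eta = 1/\lambda ^2$ has density proportional to $(1+\eta )^{-1}\exp (-\alpha _i^2\eta /(2a^2))$, which the two steps below target. Whether this matches the implementation of Sect. 3.3 step for step is not confirmed here.

```python
import math
import random

def slice_update_lambda(lam, alpha_i, a, rng):
    """One slice-sampling update of a local scale lambda > 0 (hypothetical helper)."""
    eta = 1.0 / (lam * lam)                       # work with eta = 1 / lambda^2
    rate = max(0.5 * (alpha_i / a) ** 2, 1e-12)   # small floor avoids division by zero
    # Step 1: u | eta ~ Uniform(0, 1 / (1 + eta)); 1 - random() lies in (0, 1].
    u = (1.0 - rng.random()) / (1.0 + eta)
    upper = (1.0 - u) / u                         # slice {eta : 1 / (1 + eta) > u}
    # Step 2: eta | u ~ Exponential(rate) truncated to (0, upper), via inverse CDF.
    v = rng.random()
    eta_new = -math.log1p(v * math.expm1(-rate * upper)) / rate
    return 1.0 / math.sqrt(max(eta_new, 1e-300))  # map back to lambda
```

Repeated calls, each starting from the previous value of `lam`, give a Markov chain for one local scale with the coefficient and the global scale $a$ held fixed; in a full Gibbs sampler the update would be applied once per coefficient in each sweep.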

Full conditional distribution of $\alpha ^{(r)}$

As claimed in (3.3), the conditional distribution of $\alpha ^{(r)}$ is

$$\begin{aligned}&\pi (\alpha ^{(r)}|w,\beta ^{(r)},\mathcal {B}_{-r},\{\mathbb {Y},x\} ) \propto \bigg \{ \prod ^{n}_{k=1} P(Y_k = y_k|\mathcal {B})\bigg \} f(w|\mathcal {B})\pi (\alpha ^{(r)}|\beta ^{(r)},\mathcal {B}_{-r}) \\&\quad \propto \prod ^{n}_{k=1}\bigg \{ \frac{\exp (\langle x_k, \mathcal {B}\rangle )^{y_k}}{1+\exp (\langle x_k,\mathcal {B}\rangle ) } \bigg \} \textrm{cosh}\bigg ( \frac{|\langle x_k, \mathcal {B}\rangle |}{2}\bigg ) \cdot \exp \bigg \{-\frac{(\langle x_k, \mathcal {B}\rangle )^2 w_k}{2}\bigg \} \pi (\alpha ^{(r)}|\lambda _{\alpha ^{(r)}},a) \\&\quad = 2^{-n} \pi (\alpha ^{(r)}|\lambda _{\alpha ^{(r)}},a) \prod ^{n}_{k=1} \exp \bigg \{y_k (\langle x_k,\mathcal {B}\rangle )- \frac{\langle x_k, \mathcal {B}\rangle }{2} - \frac{(\langle x_k, \mathcal {B}\rangle )^2w_k}{2} \bigg \} \\&\quad \propto \exp \bigg \{-\frac{1}{2}\alpha ^{(r)\intercal } \Sigma ^{-1}_{\alpha ^{(r)}} \alpha ^{(r)}+ \sum ^n_{k=1} \bigg (y_k-\frac{1}{2}\bigg ) \alpha ^{(r)\intercal }x_k \beta ^{(r)} -\frac{(\alpha ^{(r)\intercal } x_k \beta ^{(r)})^2}{2}w_k - \alpha ^{(r)\intercal } x_k \beta ^{(r)} \bigg ( \langle x_k, \mathcal {B}_{-r} \rangle \bigg ) w_k \bigg \} \\&\quad = \exp \bigg [ -\frac{1}{2}\alpha ^{(r)\intercal } \Sigma ^{-1}_{\alpha ^{(r)}}\alpha ^{(r)} -\frac{1}{2} \alpha ^{(r)\intercal }x^{\intercal }_{\beta ^{(r)}} \Omega (w) x_{\beta ^{(r)}} \alpha ^{(r)} + x_{\beta ^{(r)}}\bigg \{y- \frac{1}{2} {\textbf {1}}_n - x_{\mathcal {B}_{-r}}(w) \bigg \}\alpha ^{(r)} \bigg ] \\&\quad = \exp \bigg [ -\frac{1}{2} \alpha ^{(r)\intercal } \bigg \{x^{\intercal }_{\beta ^{(r)}} \Omega (w) x_{\beta ^{(r)}}+\Sigma ^{-1}_{\alpha ^{(r)}} \bigg \}\alpha ^{(r)} + x_{\beta ^{(r)}}y(w)\alpha ^{(r)} \bigg ] \end{aligned}$$

(B.1)

where the third step follows from the fact that $\cosh (u)=\frac{1+\exp (2u)}{2\exp (u)}$, $x_{\beta ^{(r)}} = (x_1 \beta ^{(r)},\ldots ,x_n \beta ^{(r)})^\intercal $, $y=(y_1,\ldots ,y_n)^{\intercal }$, $y(w) = y - \frac{1}{2} {\textbf {1}}_n-x_{\mathcal {B}_{-r}}(w) $, $x_{\mathcal {B}_{-r}}(w) = \{( \langle x_1, \mathcal {B}_{-r} \rangle )w_1,\ldots , (\langle x_n,\mathcal {B}_{-r} \rangle )w_n\}^\intercal $, ${\textbf {1}}_n$ is an $n \times 1$ vector of ones, $\Omega (w) = \textrm{diag}(w)$, and $\Sigma _{\alpha ^{(r)}}=\textrm{diag}(\lambda ^2_{\alpha ^{(r)}}a^2)$. We observe that (B.1) is the kernel of a multivariate normal distribution with mean $m_{\alpha ^{(r)}}(w)$ and covariance $\Sigma _{\alpha ^{(r)}}(w)$ such that

$$\begin{aligned}&m_{\alpha ^{(r)}}(w) = \Sigma _{\alpha ^{(r)}}(w)x_{\beta ^{(r)}}y(w), \\&\Sigma _{\alpha ^{(r)}}(w) = \bigg \{ x^\intercal _{\beta ^{(r)}} \Omega (w) x_{\beta ^{(r)}}+\Sigma ^{-1}_{\alpha ^{(r)}} \bigg \}^{-1}. \end{aligned}$$
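To make the two formulas above concrete, here is a minimal pure-Python sketch that assembles the precision matrix $x^\intercal _{\beta ^{(r)}} \Omega (w) x_{\beta ^{(r)}}+\Sigma ^{-1}_{\alpha ^{(r)}}$ and solves for the mean. The function names `solve` and `alpha_conditional`, the argument layout, and the `offset` argument (holding $\langle x_k, \mathcal {B}_{-r} \rangle$ for each $k$) are illustrative assumptions, not the authors' code. The sketch also reads the product $x_{\beta ^{(r)}}y(w)$ as $x^\intercal _{\beta ^{(r)}}y(w)$ so that the dimensions agree, which is an interpretation on our part.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def alpha_conditional(x, beta, w, y, offset, lam, a):
    """Precision matrix and mean of the Gaussian conditional of alpha^(r).

    x: n matrices of size p x q; beta: length q; w: latent weights (length n);
    y: 0/1 responses; offset[k] = <x_k, B_{-r}>; lam: p local scales; a: global scale.
    """
    n, p, q = len(x), len(lam), len(beta)
    # Row k of x_beta is (x_k beta)^T.
    x_beta = [[sum(x[k][i][j] * beta[j] for j in range(q)) for i in range(p)]
              for k in range(n)]
    # y(w) = y - 1/2 - offset * w, elementwise.
    yw = [y[k] - 0.5 - offset[k] * w[k] for k in range(n)]
    # Precision = x_beta^T diag(w) x_beta + diag(1 / (lam^2 a^2)).
    prec = [[sum(x_beta[k][i] * w[k] * x_beta[k][j] for k in range(n))
             + (1.0 / (lam[i] * a) ** 2 if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(x_beta[k][i] * yw[k] for k in range(n)) for i in range(p)]
    return prec, solve(prec, rhs)                      # mean = prec^{-1} rhs
```

Working with the precision matrix and a linear solve avoids forming the inverse explicitly; drawing $\alpha ^{(r)}$ would additionally need a Cholesky factor of the precision, which is omitted from this sketch.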

Conditional distribution of $\beta ^{(r)}$

The conditional distribution of $\beta ^{(r)}$ is

$$\begin{aligned}&\pi (\beta ^{(r)}|w,\alpha ,\{\mathbb {Y},x\}) \propto \bigg \{ \prod ^{n}_{k=1} P(Y_k = y_k|\mathcal {B})\bigg \} f(w|\mathcal {B})\pi (\beta ^{(r)}|\alpha ^{(r)},\mathcal {B}_{-r})\\&\quad \propto \prod ^{n}_{k=1}\bigg \{ \frac{\exp (\langle x_k, \mathcal {B}\rangle )^{y_k}}{1+\exp (\langle x_k, \mathcal {B}\rangle ) } \bigg \} \textrm{cosh}\bigg ( \frac{|\langle x_k, \mathcal {B}\rangle |}{2}\bigg ) \cdot \exp \bigg \{-\frac{(\langle x_k, \mathcal {B}\rangle )^2 w_k}{2}\bigg \} \pi (\beta ^{(r)}|\lambda _{\beta ^{(r)}},a) \\&\quad = 2^{-n} \pi (\beta ^{(r)}|\lambda _{\beta ^{(r)}},a) \prod ^{n}_{k=1} \exp \bigg \{y_k (\langle x_k, \mathcal {B}\rangle )- \frac{\langle x_k, \mathcal {B}\rangle }{2} - \frac{(\langle x_k, \mathcal {B}\rangle )^2w_k}{2} \bigg \} \\&\quad \propto \exp \bigg \{-\frac{1}{2}\beta ^{(r)\intercal } \Sigma ^{-1}_{\beta ^{(r)}} \beta ^{(r)}+ \sum ^n_{k=1} \bigg (y_k-\frac{1}{2}\bigg ) \alpha ^{(r)\intercal } x_k \beta ^{(r)} -\frac{(\alpha ^{(r)\intercal } x_k \beta ^{(r)})^2}{2}w_k - \alpha ^{(r)\intercal } x_k \beta ^{(r)} \bigg ( \langle x_k, \mathcal {B}_{-r} \rangle \bigg ) w_k \bigg \} \\&\quad = \exp \bigg [ -\frac{1}{2}\beta ^{(r)\intercal }\Sigma ^{-1}_{\beta ^{(r)}}\beta ^{(r)} -\frac{1}{2} \beta ^{(r)\intercal } x^{\intercal }_{\alpha ^{(r)}} \Omega (w) x_{\alpha ^{(r)}} \beta ^{(r)} + x_{\alpha ^{(r)}}\bigg \{y- \frac{1}{2} {\textbf {1}}_n - x_{\mathcal {B}_{-r}}(w) \bigg \}\beta ^{(r)} \bigg ]\\&\quad = \exp \bigg [ -\frac{1}{2} \beta ^{(r)\intercal } \bigg \{x^{\intercal }_{\alpha ^{(r)}} \Omega (w) x_{\alpha ^{(r)}}+\Sigma ^{-1}_{\beta ^{(r)}}\bigg \} \beta ^{(r)} + x_{\alpha ^{(r)}}y(w)\beta ^{(r)} \bigg ] \end{aligned}$$

(C.1)

where $x_{\alpha ^{(r)}} = ( x^\intercal _1 \alpha ^{(r)},\ldots , x^\intercal _n\alpha ^{(r)} )^\intercal $, and $\Sigma _{\beta ^{(r)}}=\textrm{diag}(\lambda ^2_{\beta ^{(r)}}a^2)$. We observe that (C.1) is the kernel of a multivariate normal distribution with mean $m_{\beta ^{(r)}}(w)$ and covariance $\Sigma _{\beta ^{(r)}}(w)$ such that

$$\begin{aligned}&m_{\beta ^{(r)}}(w) = \Sigma _{\beta ^{(r)}}(w)x_{\alpha ^{(r)}}y(w), \\&\Sigma _{\beta ^{(r)}}(w) = \bigg \{ x^\intercal _{\alpha ^{(r)}} \Omega (w) x_{\alpha ^{(r)}}+\Sigma ^{-1}_{\beta ^{(r)}} \bigg \}^{-1}. \end{aligned}$$
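The updates for $\alpha ^{(r)}$ and $\beta ^{(r)}$ mirror each other because the rank-one term $\alpha ^{(r)\intercal } x_k \beta ^{(r)}$ used in the derivations is linear in each factor when the other is held fixed. The short sketch below checks this numerically on a toy matrix. The helper names are hypothetical, and the reading of the $k$th rows of $x_{\beta ^{(r)}}$ and $x_{\alpha ^{(r)}}$ as $x_k\beta ^{(r)}$ and $x_k^\intercal \alpha ^{(r)}$ is taken from the definitions given with (B.1) and (C.1).

```python
def bilinear(x_k, alpha, beta):
    """alpha^T x_k beta = sum_ij alpha[i] * x_k[i][j] * beta[j]."""
    return sum(alpha[i] * x_k[i][j] * beta[j]
               for i in range(len(alpha)) for j in range(len(beta)))

def row_for_alpha(x_k, beta):
    """x_k beta: the k-th row of the design matrix used when updating alpha."""
    return [sum(x_k[i][j] * beta[j] for j in range(len(beta)))
            for i in range(len(x_k))]

def row_for_beta(x_k, alpha):
    """x_k^T alpha: the k-th row of the design matrix used when updating beta."""
    return [sum(x_k[i][j] * alpha[i] for i in range(len(alpha)))
            for j in range(len(x_k[0]))]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Toy 2 x 2 covariate matrix and factor vectors (made-up numbers).
x_k = [[1.0, 2.0], [3.0, 4.0]]
alpha = [1.0, -1.0]
beta = [2.0, 1.0]

value = bilinear(x_k, alpha, beta)
via_alpha = dot(row_for_alpha(x_k, beta), alpha)   # linear in alpha, beta fixed
via_beta = dot(row_for_beta(x_k, alpha), beta)     # linear in beta, alpha fixed
```

All three quantities coincide, so the same Gaussian-conditional routine can be reused for $\beta ^{(r)}$ by swapping in the rows $x_k^\intercal \alpha ^{(r)}$ and the prior scales $\lambda _{\beta ^{(r)}}$.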

Derivation of the conditional distribution (4.2)

Noting the following equivalent statements:

$$\begin{aligned} ``H_{k} =1, {Y^{*}_{k}} ={y^{*}_{k}}''\iff ``H_{k} =1, Y_{k} ={y^{*}_{k}}'' \end{aligned}$$

and

$$\begin{aligned} ``H_{k} =0, {Y^{*}_{k}} ={y^{*}_{k}}'' \iff ``H_{k} =0, Y_{k} =1-{y^{*}_{k}}'', \end{aligned}$$

we have that

$$\begin{aligned} \begin{aligned}&P(H_{k}=1|Y^{*}_{k} = y^{*}_{k},x_{k})\\&\quad = \frac{P(H_{k}=1,Y^{*}_{k} = y^{*}_{k}|x_{k})}{P(H_{k}=1,Y^{*}_{k} = y^{*}_{k}|x_{k}) +P(H_{k}=0,Y^{*}_{k} = y^{*}_{k}|x_{k})} \\&\quad = \frac{P(H_{k}=1,Y_{k} = y^{*}_{k}|x_{k})}{P(H_{k}=1,Y_{k} = y^{*}_{k}|x_{k}) +P(H_{k}=0,Y_{k} =1- y^{*}_{k}|x_{k})} \\&\quad = \frac{P(H_{k}=1|Y_{k} = y^{*}_{k},x_{k})P(Y_{k} = y^{*}_{k}|x_{k})}{\begin{array}{c}P(H_{k}=1|Y_{k} = y^{*}_{k},x_{k})P(Y_{k} = y^{*}_{k}|x_{k})\\ +P(H_{k}=0|Y_{k} = 1-y^{*}_{k},x_{k})P(Y_{k} = 1-y^{*}_{k}|x_{k})\end{array}} \\&\quad = \frac{\rho (y^{*}_{k})p_{k}^{y^{*}_{k}}(1-p_{k})^{1-y^{*}_{k}}}{\rho (y^{*}_{k})p_{k}^{y^{*}_{k}}(1-p_{k})^{1-y^{*}_{k}} +(1-\rho (1-y^{*}_{k}))p_{k}^{1-y^{*}_{k}}(1-p_{k})^{y^{*}_{k}}} \end{aligned} \end{aligned}$$

That is, (4.2) holds.
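The final expression can be evaluated directly. The sketch below is a minimal illustration with a hypothetical function name; it assumes, as the last step of the derivation suggests, that $\rho (y) = P(H_k=1|Y_k=y,x_k)$ is the probability of correct classification given the true response, and that $p_k = P(Y_k=1|x_k)$.

```python
def prob_correct(y_star, p_k, rho0, rho1):
    """P(H_k = 1 | Y*_k = y_star, x_k) following the last line of the derivation.

    y_star: observed (possibly misclassified) response, 0 or 1.
    p_k: P(Y_k = 1 | x_k).  rho0, rho1: rho(0) and rho(1) (assumed meaning above).
    """
    rho = (rho0, rho1)
    # rho(y*) p^{y*} (1 - p)^{1 - y*}: observed label is correct.
    correct = rho[y_star] * (p_k if y_star == 1 else 1.0 - p_k)
    # (1 - rho(1 - y*)) p^{1 - y*} (1 - p)^{y*}: true label was flipped.
    flipped = (1.0 - rho[1 - y_star]) * ((1.0 - p_k) if y_star == 1 else p_k)
    return correct / (correct + flipped)
```

For example, with $p_k = 0.5$ and $\rho (0)=\rho (1)=0.8$, the probability that an observed $y^*_k=1$ is correct is $0.4/(0.4+0.1)=0.8$; with $\rho (0)=\rho (1)=1$ the function returns 1 for any $p_k$ strictly between 0 and 1.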

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fang, J., Yi, G.Y. Bayesian analysis for matrix-variate logistic regression with/without response misclassification. Stat Comput 33, 121 (2023). https://doi.org/10.1007/s11222-023-10286-4

Received: 12 November 2022
Accepted: 26 July 2023
Published: 23 August 2023
DOI: https://doi.org/10.1007/s11222-023-10286-4
