Abstract
Matrix-variate logistic regression is useful for characterizing the relationship between a binary response and matrix-variate covariates, which arise commonly in medical imaging research. However, inference based on such a model is impaired by response misclassification and spurious covariates. It is imperative to account for misclassification effects and to select active covariates when employing matrix-variate logistic regression to handle such data. In this paper, we develop Bayesian inferential methods with the horseshoe prior. We numerically examine the biases induced by the naive analysis, which ignores misclassification of responses. The performance of the proposed methods is justified empirically, and their usage is illustrated by an application to the Lee Silverman Voice Treatment (LSVT) Companion data.
References
Bhattacharya, A., Pati, D., Pillai, N.S., Dunson, D.B.: Dirichlet–Laplace priors for optimal shrinkage. J. Am. Stat. Assoc. 110, 1479–1490 (2015)
Biane, P., Pitman, J., Yor, M.: Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions. Bull. Am. Math. Soc. 38, 435–465 (2001)
Carvalho, C.M., Polson, N.G., Scott, J.G.: The horseshoe estimator for sparse signals. Biometrika 97, 465–480 (2010)
Choi, H.M., Hobert, J.P.: The Polya-Gamma Gibbs sampler for Bayesian logistic regression is uniformly ergodic. Electron. J. Stat. 7, 2054–2064 (2013)
Dellaportas, P., Stephens, D.A.: Bayesian analysis of errors-in-variables regression models. Biometrics 51, 1085–1095 (1995)
Fang, J., Yi, G.Y.: Matrix-variate logistic regression with measurement error. Biometrika 108, 83–97 (2020)
Gamerman, D.: Sampling from the posterior distribution in generalized linear mixed models. Stat. Comput. 7, 57–68 (1997)
George, E.I., McCulloch, R.E.: Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88, 881–889 (1993)
Gerlach, R., Stamey, J.: Bayesian model selection for logistic regression with misclassified outcomes. Stat. Model. 7, 255–273 (2007)
Gramacy, R.B., Polson, N.G.: Simulation-based regularized logistic regression. Bayesian Anal. 7, 567–590 (2012)
Guhaniyogi, R., Qamar, S., Dunson, D.B.: Bayesian tensor regression. J. Mach. Learn. Res. 18, 2733–2763 (2017)
Gustafson, P.: Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments. CRC Press, Boca Raton (2003)
Holmes, C.C., Held, L.: Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Anal. 1, 145–168 (2006)
Hung, H., Wang, C.-C.: Matrix variate logistic regression model with application to EEG data. Biostatistics 14, 189–202 (2013)
Ishwaran, H., Rao, J.S.: Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Stat. 33, 730–773 (2005)
McInturff, P., Johnson, W.O., Cowling, D., Gardner, I.A.: Modelling risk when binary outcomes are subject to error. Stat. Med. 23, 1095–1109 (2004)
Paulino, C.D., Soares, P., Neuhaus, J.: Binomial regression with misclassification. Biometrics 59, 670–675 (2003)
Polson, N.G., Scott, J.G., Windle, J.: Bayesian inference for logistic models using Polya-Gamma latent variables. J. Am. Stat. Assoc. 108, 1339–1349 (2013)
Polson, N.G., Scott, J.G., Windle, J.: The Bayesian bridge. J. R. Stat. Soc. B 76, 713–733 (2014)
Rekaya, R., Weigel, K.A., Gianola, D.: Threshold model for misclassified binary responses with applications to animal breeding. Biometrics 57, 1123–1129 (2001)
Richardson, S., Gilks, W.R.: A Bayesian approach to measurement error problems in epidemiology using conditional independence models. Am. J. Epidemiol. 138, 430–442 (1993)
Rossi, P.E., Allenby, G.M., McCulloch, R.: Bayesian Statistics and Marketing. Wiley, New York (2005)
Tanner, M.A., Wong, W.H.: The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82, 528–540 (1987)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc.: Ser. B (Methodol.) 58, 267–288 (1996)
Tsanas, A., Little, M.A., Fox, C., Ramig, L.O.: Objective automatic assessment of rehabilitative speech treatment in Parkinson’s disease. IEEE Trans. Neural Syst. Rehabil. Eng. 22, 181–190 (2013)
Wei, R., Ghosal, S.: Contraction properties of shrinkage priors in logistic regression. J. Stat. Plan. Inference 207, 215–229 (2020)
Zeger, S.L., Karim, M.: Generalized linear models with random effects: a Gibbs sampling approach. J. Am. Stat. Assoc. 86, 79–86 (1991)
Zellner, A., Rossi, P.E.: Bayesian analysis of dichotomous quantal response models. J. Econom. 25, 365–393 (1984)
Zhou, H., Li, L., Zhu, H.: Tensor regression with applications in neuroimaging data analysis. J. Am. Stat. Assoc. 108, 540–552 (2013)
Funding
This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). Yi is Canada Research Chair in Data Science (Tier 1). Her research was undertaken, in part, thanks to funding from the Canada Research Chairs program.
Author information
Contributions
JF and GYY wrote the manuscript text. JF and GYY reviewed the manuscript.
Appendices
Appendix
Full conditional distribution of hyperparameters
As stated in (3.4), the hyperparameters \(\lambda _{\alpha _i}\), \(\lambda _{\beta _i}\), \(\lambda _{\gamma _i}\) and \(a\) are each assigned a half-Cauchy prior. Here we give the conditional distribution of \(\lambda _{\alpha _i}\) only; the conditional distributions of the other hyperparameters can be derived in the same manner:
\(\pi (\lambda _{\alpha _i}| \alpha , \beta ,\gamma ,a)\)
The slice-sampling algorithm (Polson et al. 2014) is used to generate \(\lambda _{\alpha _i}\), as in Sect. 3.3.
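The slice-sampling step can be sketched in code. The following is a minimal sketch, not the authors' implementation: it assumes the horseshoe hierarchy \(\alpha _i \mid \lambda _{\alpha _i}, a \sim N(0, \lambda ^2_{\alpha _i}a^2)\) with \(\lambda _{\alpha _i}\) following a half-Cauchy(0, 1) prior, which agrees with the prior covariance \(\mathrm{diag}(\lambda ^2 a^2)\) used later in this appendix. Under that assumption, writing \(\eta = 1/\lambda ^2_{\alpha _i}\) gives a target proportional to \(\exp \{-\eta \alpha _i^2/(2a^2)\}/(1+\eta )\), which can be sampled with one uniform auxiliary variable and one truncated exponential draw. The function name and arguments are hypothetical.

```python
import numpy as np


def slice_update_local_scale(lam, theta, a, rng):
    """One slice-sampling update of a horseshoe local scale (illustrative).

    Assumes theta | lam, a ~ N(0, lam^2 a^2) and lam ~ half-Cauchy(0, 1).
    With eta = 1 / lam^2, the target is exp(-rate * eta) / (1 + eta),
    where rate = theta^2 / (2 a^2).
    """
    eta = 1.0 / lam**2
    rate = max(theta**2 / (2.0 * a**2), 1e-12)  # floor avoids division by zero
    # Step 1: auxiliary variable u | eta ~ Uniform(0, 1 / (1 + eta)).
    u = max(rng.uniform(0.0, 1.0 / (1.0 + eta)), 1e-300)
    # Step 2: eta | u ~ Exponential(rate) truncated to (0, (1 - u) / u).
    upper = (1.0 - u) / u
    v = rng.uniform(0.0, -np.expm1(-rate * upper))  # uniform on (0, F(upper))
    eta = max(-np.log1p(-v) / rate, 1e-300)         # inverse exponential CDF
    return 1.0 / np.sqrt(eta)


# Usage: iterate the update for a fixed coefficient value and global scale.
rng = np.random.default_rng(0)
lam = 1.0
for _ in range(1000):
    lam = slice_update_local_scale(lam, theta=0.5, a=1.0, rng=rng)
```

The same update would apply to \(\lambda _{\beta _i}\) and \(\lambda _{\gamma _i}\) by substituting the corresponding coefficient; the update for \(a\) depends on details of (3.4) that are not shown here.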
Full conditional distribution of \(\alpha ^{(r)}\)
As claimed in (3.3), the conditional distribution of \(\alpha ^{(r)}\) is
where the third step follows from the identity \(\cosh (u)=\frac{1+\exp (2u)}{2\exp (u)}\), \(x_{\beta ^{(r)}} = (x_1 \beta ^{(r)},\ldots ,x_n \beta ^{(r)})^\intercal \), \(y=(y_1,\ldots ,y_n)^{\intercal }\), \(y(w) = y - \frac{1}{2}{\textbf {1}}_n-x_{\mathcal {B}_{-r}}(w) \), \(x_{\mathcal {B}_{-r}}(w) = \{ \langle x_1, \mathcal {B}_{-r} \rangle w_1,\ldots , \langle x_n,\mathcal {B}_{-r} \rangle w_n\}^\intercal \), \({\textbf {1}}_n\) is the \(n \times 1\) vector of ones, \(\Omega (w) = \mathrm{diag}(w)\), and \(\Sigma _{\alpha ^{(r)}}=\mathrm{diag}(\lambda ^2_{\alpha ^{(r)}}a^2)\). We observe that (B.1) is the kernel of a multivariate normal distribution with mean \(m_{\alpha ^{(r)}}(w)\) and covariance \(\Sigma _{\alpha ^{(r)}}(w)\) such that
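The display defining the mean and covariance is not reproduced in this version of the text. Assuming the standard Pólya-Gamma result of Polson et al. (2013) applies with the symbols defined above, the two quantities would take the form \(\Sigma _{\alpha ^{(r)}}(w) = \{x_{\beta ^{(r)}}^\intercal \Omega (w) x_{\beta ^{(r)}} + \Sigma _{\alpha ^{(r)}}^{-1}\}^{-1}\) and \(m_{\alpha ^{(r)}}(w) = \Sigma _{\alpha ^{(r)}}(w) x_{\beta ^{(r)}}^\intercal y(w)\). The term \(y - \frac{1}{2}{\textbf {1}}_n\) arises because \(e^{y\psi }/(1+e^{\psi }) = e^{(y-1/2)\psi }/\{2\cosh (\psi /2)\}\), which is the cosh identity with \(u=\psi /2\). The sketch below draws from such a normal distribution; it is an illustration under these assumptions, with hypothetical names: `X` plays the role of \(x_{\beta ^{(r)}}\), `offset` holds \(\langle x_i, \mathcal {B}_{-r} \rangle \), and `prior_var` holds the diagonal of \(\Sigma _{\alpha ^{(r)}}\). The weights `w` are assumed to be supplied by a Pólya-Gamma sampler; uniform placeholders are used here because NumPy does not provide one.

```python
import numpy as np


def draw_normal_conditional(X, y, offset, w, prior_var, rng):
    """Draw from N(m, S), S = (X' diag(w) X + diag(1 / prior_var))^{-1},
    m = S X' z, with working response z = y - 1/2 - offset * w.

    Returns the draw together with m and S (illustrative helper).
    """
    z = y - 0.5 - offset * w
    precision = X.T @ (w[:, None] * X) + np.diag(1.0 / prior_var)
    mean = np.linalg.solve(precision, X.T @ z)
    # precision = chol @ chol.T, so solving chol.T x = e with e ~ N(0, I)
    # yields x with covariance precision^{-1}.
    chol = np.linalg.cholesky(precision)
    noise = np.linalg.solve(chol.T, rng.standard_normal(X.shape[1]))
    cov = np.linalg.inv(precision)
    return mean + noise, mean, cov


# Usage on synthetic inputs (all values below are placeholders).
rng = np.random.default_rng(1)
n, p = 40, 3
X = rng.normal(size=(n, p))
y = rng.integers(0, 2, size=n).astype(float)
offset = np.zeros(n)
w = rng.uniform(0.1, 0.3, size=n)  # stand-in for Polya-Gamma draws
prior_var = np.full(p, 4.0)
draw, mean, cov = draw_normal_conditional(X, y, offset, w, prior_var, rng)
```

Working with the Cholesky factor of the precision matrix avoids forming the covariance for the draw itself; the explicit inverse is returned only for inspection.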
Conditional distribution of \(\beta ^{(r)}\)
The conditional distribution of \(\beta ^{(r)}\) is
where \(x_{\alpha ^{(r)}} = ( x^\intercal _1 \alpha ^{(r)},\ldots , x^\intercal _n\alpha ^{(r)} )^\intercal \), and \(\Sigma _{\beta ^{(r)}}=\mathrm{diag}(\lambda ^2_{\beta ^{(r)}}a^2)\). We observe that (C.1) is the kernel of a multivariate normal distribution with mean \(m_{\beta ^{(r)}}(w)\) and covariance \(\Sigma _{\beta ^{(r)}}(w)\) such that
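The display defining these two quantities is not reproduced in this version of the text. As a hedged reconstruction, assuming the standard Pólya-Gamma normal update (Polson et al. 2013) and that \(y(w)\) and \(\Omega (w)\) are as defined for \(\alpha ^{(r)}\), the roles of \(\alpha ^{(r)}\) and \(\beta ^{(r)}\) are exchanged and the expressions would read:

\[
\Sigma _{\beta ^{(r)}}(w) = \left\{ x_{\alpha ^{(r)}}^\intercal \, \Omega (w)\, x_{\alpha ^{(r)}} + \Sigma _{\beta ^{(r)}}^{-1} \right\}^{-1},
\qquad
m_{\beta ^{(r)}}(w) = \Sigma _{\beta ^{(r)}}(w)\, x_{\alpha ^{(r)}}^\intercal \, y(w).
\]

These follow from completing the square in an exponent of the form \(y(w)^\intercal x_{\alpha ^{(r)}}\beta ^{(r)} - \frac{1}{2}\beta ^{(r)\intercal }\{x_{\alpha ^{(r)}}^\intercal \Omega (w) x_{\alpha ^{(r)}} + \Sigma _{\beta ^{(r)}}^{-1}\}\beta ^{(r)}\); the authors' own display (C.1) should be consulted for the exact statement.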
Derivation of the conditional distribution (4.2)
Noting the following equivalent statements:
and
we have that
That is, (4.2) holds.
About this article
Cite this article
Fang, J., Yi, G.Y. Bayesian analysis for matrix-variate logistic regression with/without response misclassification. Stat Comput 33, 121 (2023). https://doi.org/10.1007/s11222-023-10286-4
DOI: https://doi.org/10.1007/s11222-023-10286-4