Bayesian inference on longitudinal-survival data with multiple features

Lu, Tao

doi:10.1007/s00180-016-0681-3

Original Paper
Published: 18 October 2016

Volume 32, pages 845–866, (2017)
Computational Statistics

Tao Lu¹

Abstract

The modeling of longitudinal and survival data is an active research area. Most research focuses on improving estimation efficiency but ignores many data features frequently encountered in practice. In this article, we develop a joint model that concurrently accounts for longitudinal-survival data with multiple features. Specifically, our joint model handles skewness, limit of detection, missingness and measurement errors in covariates, which are typically observed in the collection of longitudinal-survival data from many studies. We employ a Bayesian approach for making inference on the joint model. The proposed model and method are applied to an AIDS study. A few alternative models under different conditions are compared. Some interesting results are reported. Simulation studies are conducted to assess the performance of the proposed methods.

References

Albert PS, Shih JH (2010) On estimating the relationship between longitudinal measurements and time-to-event data using a simple two-stage procedure. Biometrics 66:983–987
Arellano-Valle R, Genton M (2005) On fundamental skew distributions. J Multivar Anal 96:93–116
Azzalini A, Capitanio A (2003) Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J R Stat Soc Ser B 65:367–389
Azzalini A, Genton M (2008) Robust likelihood methods based on the skew-t and related distributions. Int Stat Rev 76:106–129
Brown ER, Ibrahim JG (2003) Bayesian approaches to joint cure-rate and longitudinal models with applications to cancer vaccine trials. Biometrics 59(3):686–693
Celeux G, Forbes F, Robert C, Titterington M (2006) Deviance information criteria for missing data models. Bayesian Anal 1:651–674
Elashoff RM, Li G, Li N (2008) A joint model for longitudinal measurements and survival data in the presence of multiple failure types. Biometrics 64:762–771
Eubank R (1999) Nonparametric regression and spline smoothing. Dekker, New York
Faucett CL, Thomas DC (1996) Simultaneously modelling censored survival data and repeatedly measured covariates: a Gibbs sampling approach. Stat Med 15(15):1663–1685
Gelman A, Rubin D (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511
Gelman A, Carlin J, Stern H, Rubin D (2003) Bayesian data analysis. Chapman and Hall, London
Henderson R, Diggle P, Dobson A (2000) Joint modelling of longitudinal measurements and event time data. Biostatistics 1(4):465–480
Ho H, Lin T (2010) Robust linear mixed models using the skew-t distribution with application to schizophrenia data. Biometr J 52:449–469
Hu W, Li G, Li N (2009) A Bayesian approach to joint analysis of longitudinal measurements and competing risks failure time data. Stat Med 28:1601–1619
Huang X, Li G, Elashoff R, Pan J (2011) A general joint model for longitudinal measurements and competing risks survival data with heterogeneous random effects. Lifetime Data Anal 17:80–100
Hughes J (1999) Mixed effects models with censored data with application to HIV RNA levels. Biometrics 55:625–629
Jara A, Quintana F, Martin E (2008) Linear mixed models with skew-elliptical distributions: a Bayesian approach. Comput Stat Data Anal 52:5033–5045
Lachos V, Bandyopadhyay D, Dey D (2011) Linear and nonlinear mixed-effects models for censored HIV viral loads using normal/independent distributions. Biometrics 67:1594–1604
Li N, Elashoff RM, Li G, Tseng CH (2012) Joint analysis of bivariate longitudinal ordinal outcomes and competing risks survival times with nonparametric distributions for random effects. Stat Med 31:1707–1721
Little R, Rubin D (2002) Statistical analysis with missing data. Wiley, New York
Liu W, Wu L (2007) Simultaneous inference for semiparametric nonlinear mixed-effects models with covariate measurement errors and missing responses. Biometrics 63:342–350
Rizopoulos D (2011) Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67(3):819–829
Sahu S, Dey D, Branco M (2003) A new class of multivariate skew distributions with applications to Bayesian regression models. Can J Stat 31:129–150
Wu L (2002) A joint model for nonlinear mixed-effects models with censoring and covariates measured with error. J Am Stat Assoc 97:955–964
Wu L (2007) HIV viral dynamic models with dropouts and missing covariates. Stat Med 26:3342–3357
Wulfsohn MS, Tsiatis AA (1997) A joint model for survival and longitudinal data measured with error. Biometrics 53:330–339

Author information

Authors and Affiliations

Department of Mathematics and Statistics, University of Nevada, Reno, NV, 89557, USA
Tao Lu

Corresponding author

Correspondence to Tao Lu.

Appendix: Multivariate skew distributions

Different versions of multivariate skew distributions have been introduced in the literature (Sahu et al. 2003; Azzalini and Capitanio 2003; Azzalini and Genton 2008; Jara et al. 2008). Sahu et al. (2003) developed a new class of distributions by introducing skewness into multivariate elliptical distributions. The class, which is obtained by using transformation and conditioning, contains many standard families, including the multivariate skew-normal (SN) and skew-t (ST) distributions, as special cases. A k-dimensional random vector ${\varvec{Y}}$ follows a k-variate skew-elliptical (SE) distribution if its probability density function (pdf) is given by

$$\begin{aligned} f({\varvec{y}}|{\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }};m^{(k)}_{\nu })= 2^k f({\varvec{y}}|{\varvec{\mu }}, {\varvec{A}};m^{(k)}_{\nu })P({\varvec{V}}>\mathbf 0 ), \end{aligned}$$

(14)

where ${\varvec{A}}={\varvec{\Sigma }}+{\varvec{\Delta }}^2$, ${\varvec{\mu }}$ is a location parameter vector, ${\varvec{\Sigma }}$ is a $k \times k$ positive (diagonal) covariance matrix, and ${\varvec{\Delta }}=\text {diag}(\delta _1, \delta _2,\ldots , \delta _k)$ is a $k \times k$ skewness matrix with the skewness parameter vector ${\varvec{\delta }}=(\delta _1,\delta _2,\ldots ,\delta _k)^T$. The vector ${\varvec{V}}$ follows the elliptical distribution $El({\varvec{\Delta }}{\varvec{A}}^{-1}({\varvec{y}}-{\varvec{\mu }}), {\varvec{I}}_{k}-{\varvec{\Delta }}{\varvec{A}}^{-1}{\varvec{\Delta }}; m^{(k)}_{\nu })$, and the density generator function is $m^{(k)}_{\nu }(\zeta )=\frac{\Gamma (k/2)}{\pi ^{k/2}}\frac{m_{\nu }(\zeta )}{\int _0^{\infty }r^{k/2-1}m_{\nu }(r)dr}$, with $m_{\nu }(\cdot )$ being a function such that $\int _0^{\infty }r^{k/2-1}m_{\nu }(r)dr$ exists. The function $m_{\nu }(\zeta )$ provides the kernel of the original elliptical density and may depend on the parameter $\nu $. This SE distribution is denoted by $SE({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }};m^{(k)}_{\nu })$. Two examples of $m_{\nu }(\zeta )$, leading to important special cases used throughout the paper, are $m_{\nu }(\zeta )=\exp (-\zeta /2)$ and $m_{\nu }(\zeta )=(1+\zeta /\nu )^{-(\nu +k)/2}$, where $\nu >0$. These two expressions lead to the multivariate SN and ST distributions, respectively. In the latter case, $\nu $ corresponds to the degrees of freedom parameter.
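As an illustration of the two kernels, the following minimal Python sketch evaluates the normalized density generators in closed form. It assumes that the normalizing integral is taken over the kernel's own argument $r$, which gives $\Gamma (k/2)2^{k/2}$ for the SN kernel and $\nu ^{k/2}\Gamma (k/2)\Gamma (\nu /2)/\Gamma ((\nu +k)/2)$ for the ST kernel. The function names `sn_generator` and `st_generator` are hypothetical and do not come from the paper.

```python
import math


def sn_generator(zeta, k):
    """Normalized generator for the kernel exp(-zeta/2).

    Closed form under the stated assumption: (2*pi)^(-k/2) * exp(-zeta/2).
    """
    return (2.0 * math.pi) ** (-k / 2.0) * math.exp(-zeta / 2.0)


def st_generator(zeta, k, nu):
    """Normalized generator for the kernel (1 + zeta/nu)^(-(nu+k)/2).

    Closed form under the stated assumption:
    Gamma((nu+k)/2) / (Gamma(nu/2) * (nu*pi)^(k/2)) * kernel.
    The constant is computed on the log scale to avoid overflow for large nu.
    """
    if nu <= 0:
        raise ValueError("nu must be positive")
    log_const = (
        math.lgamma((nu + k) / 2.0)
        - math.lgamma(nu / 2.0)
        - (k / 2.0) * math.log(nu * math.pi)
    )
    return math.exp(log_const) * (1.0 + zeta / nu) ** (-(nu + k) / 2.0)
```

For $k=1$ and $\zeta =y^2$ these reduce to the standard normal and Student t densities, and for large $\nu $ the ST generator approaches the SN generator.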

A normal distribution is the special case of an SN distribution in which the skewness parameter is zero, and the ST distribution reduces to the SN distribution when the degrees of freedom are large. For completeness, this Appendix briefly summarizes the multivariate ST distribution introduced by Sahu et al. (2003), which is suitable for Bayesian inference since it is built using the conditional method. For detailed discussions of the properties of the ST distribution, see Sahu et al. (2003). Assume a k-dimensional random vector ${\varvec{Y}}$ follows a k-variate ST distribution with location vector ${\varvec{\mu }}$, $k \times k$ positive (diagonal) covariance matrix ${\varvec{\Sigma }}$, $k \times k$ skewness matrix ${\varvec{\Delta }}=\text {diag}(\delta _1, \delta _2,\ldots , \delta _k)$ and degrees of freedom $\nu $.

A k-dimensional random vector ${\varvec{Y}}$ follows a k-variate ST distribution if its probability density function (pdf) is given by

$$\begin{aligned} f({\varvec{y}}|{\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }},\nu )= 2^k t_{k,\nu }({\varvec{y}}|{\varvec{\mu }}, {\varvec{A}})P({\varvec{V}}>\mathbf 0 ), \end{aligned}$$

(15)

where ${\varvec{A}}={\varvec{\Sigma }}+{\varvec{\Delta }}^2$ and ${\varvec{V}}$ follows the t distribution $t_{k,\nu +k}$. Henceforth, we denote the k-variate t distribution with parameters ${\varvec{\mu }}$, ${\varvec{A}}$ and degrees of freedom $\nu $ by $t_{k,\nu }({\varvec{\mu }}, {\varvec{A}})$ and the corresponding pdf by $t_{k,\nu }({\varvec{y}}|{\varvec{\mu }}, {\varvec{A}})$. We denote this ST distribution by $ST_{k,\nu }({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }})$. In particular, when ${\varvec{\Sigma }}=\sigma ^2 {\varvec{I}}_k$ and ${\varvec{\Delta }}=\delta {\varvec{I}}_k$, Eq. (15) simplifies to

$$\begin{aligned} f({\varvec{y}}|{\varvec{\mu }},\sigma ^2,\delta ,\nu )= 2^k (\sigma ^2+\delta ^2)^{-k/2}\frac{\Gamma ((\nu +k)/2)}{\Gamma (\nu /2)(\nu \pi )^{k/2}} \left\{ 1+\frac{({\varvec{y}}-{\varvec{\mu }})^T({\varvec{y}}-{\varvec{\mu }})}{\nu (\sigma ^2+\delta ^2)} \right\} ^{-(\nu +k)/2} \\ \times \, T_{k,\nu +k} \left[ \left\{ \frac{\nu +(\sigma ^2+\delta ^2)^{-1}({\varvec{y}}-{\varvec{\mu }})^T({\varvec{y}}-{\varvec{\mu }})}{\nu +k}\right\} ^{-1/2}\frac{\delta ({\varvec{y}}-{\varvec{\mu }})}{\sigma \sqrt{\sigma ^2+\delta ^2}}\right] , \end{aligned}$$

where $T_{k,\nu +k}(\cdot )$ denotes the cumulative distribution function (cdf) of $t_{k,\nu +k}(\mathbf 0 ,{\varvec{I}}_k)$. Unlike the SN density, the ST density cannot be written as the product of univariate ST densities. Here the components of ${\varvec{Y}}$ are dependent but uncorrelated. When ${\varvec{\delta }}=\mathbf 0 $, the ST distribution reduces to the usual t-distribution. It can be shown that the mean and covariance matrix of the ST distribution $ST_{k,\nu }({\varvec{\mu }},\sigma ^2 {\varvec{I}}_k,{\varvec{\Delta }})$ are given by

$$\begin{aligned} E({\varvec{Y}})= & {} {\varvec{\mu }}+(\nu /\pi )^{1/2}\frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}{\varvec{\delta }}, \nonumber \\ \text {cov}({\varvec{Y}})= & {} \left[ \sigma ^2{\varvec{I}}_k+{\varvec{\Delta }}^2({\varvec{\delta }})\right] \frac{\nu }{\nu -2}-\frac{\nu }{\pi }\left[ \frac{\Gamma \{(\nu -1)/2\}}{\Gamma (\nu /2)}\right] ^2{\varvec{\Delta }}^2({\varvec{\delta }}). \end{aligned}$$

(16)
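The moments in Eq. (16) are simple to evaluate numerically. The sketch below is a minimal illustration that assumes ${\varvec{\Sigma }}=\sigma ^2{\varvec{I}}_k$ and a diagonal ${\varvec{\Delta }}$, so that the covariance matrix is diagonal and only its diagonal is returned. The function names are hypothetical, and the mean and covariance are taken to exist only for $\nu >1$ and $\nu >2$, respectively.

```python
import math


def st_mean_shift(nu):
    """Scalar (nu/pi)^(1/2) * Gamma((nu-1)/2) / Gamma(nu/2) from Eq. (16)."""
    if nu <= 1:
        raise ValueError("the mean requires nu > 1")
    log_ratio = math.lgamma((nu - 1.0) / 2.0) - math.lgamma(nu / 2.0)
    return math.sqrt(nu / math.pi) * math.exp(log_ratio)


def st_mean(mu, delta, nu):
    """E(Y) = mu + c(nu) * delta, computed componentwise."""
    c = st_mean_shift(nu)
    return [m + c * d for m, d in zip(mu, delta)]


def st_cov_diag(sigma2, delta, nu):
    """Diagonal of cov(Y) in Eq. (16) when Sigma = sigma2 * I_k."""
    if nu <= 2:
        raise ValueError("the covariance requires nu > 2")
    c = st_mean_shift(nu)
    return [(sigma2 + d * d) * nu / (nu - 2.0) - c * c * d * d for d in delta]


def zero_mean_location(delta, nu):
    """Location vector mu = -c(nu) * delta that makes E(Y) = 0."""
    c = st_mean_shift(nu)
    return [-c * d for d in delta]
```

For example, `st_mean(zero_mean_location(delta, nu), delta, nu)` returns a vector of zeros up to rounding error, and with `delta` set to zero `st_cov_diag` returns the usual t variance $\sigma ^2\nu /(\nu -2)$.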

In order to have a zero mean, we should assume the location parameter ${\varvec{\mu }}=-(\nu /\pi )^{1/2}\frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}{\varvec{\delta }}$. According to Lemma 1 of Azzalini and Capitanio (2003), if ${\varvec{Y}}$ follows $ST_{k,\nu }({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }})$, it can be represented by

$$\begin{aligned} {\varvec{Y}}={\varvec{\mu }}+\zeta ^{-1/2}{\varvec{X}}\end{aligned}$$

(17)

where $\zeta $ follows a Gamma distribution $\Gamma (\nu /2,\nu /2)$, which is independent of ${\varvec{X}}$, and ${\varvec{X}}$ follows a k-dimensional skew-normal (SN) distribution, denoted by $SN_k(\mathbf 0 ,{\varvec{\Sigma }},{\varvec{\Delta }})$. It follows from (17) that ${\varvec{Y}}|\zeta \sim SN_k({\varvec{\mu }}, \zeta ^{-1}{\varvec{\Sigma }},\zeta ^{-1/2}{\varvec{\Delta }})$. Following studies by Azzalini and Genton (2008), the SN distribution of ${\varvec{Y}}$, conditional on $\zeta $, has a convenient stochastic representation as

$$\begin{aligned} {\varvec{Y}}={\varvec{\mu }}+\zeta ^{-1/2}{\varvec{\Delta }}|{\varvec{X}}_0|+\zeta ^{-1/2}{\varvec{\Sigma }}^{1/2}{\varvec{X}}, \end{aligned}$$

(18)

where ${\varvec{X}}_0$ and ${\varvec{X}}$ are two independent $N_k(\mathbf 0 ,{\varvec{I}}_k)$ random vectors. Note that expression (18) provides a convenient device for random number generation and for implementation purposes. Let ${\varvec{w}}=\zeta ^{-1/2}|{\varvec{X}}_0|$; then ${\varvec{w}}$, conditional on $\zeta $, follows a k-dimensional normal distribution $N_k(\mathbf 0 , \zeta ^{-1}{\varvec{I}}_k)$ truncated to the space ${\varvec{w}}>\mathbf 0 $ (i.e., the half-normal distribution). Thus, following Jara et al. (2008), a hierarchical representation of (18) is given by

$$\begin{aligned} {\varvec{Y}}|{\varvec{w}},\zeta \sim N_k({\varvec{\mu }}+{\varvec{\Delta }}{\varvec{w}}, \zeta ^{-1}{\varvec{\Sigma }}),\; {\varvec{w}}|\zeta \sim N_k(\mathbf 0 ,\zeta ^{-1}{\varvec{I}}_k){\varvec{I}}({\varvec{w}}>\mathbf 0 ),\; \zeta \sim \Gamma (\nu /2, \nu /2), \end{aligned}$$

(19)

Note that the ST distribution presented in (19) reduces to the following three special cases: (i) as $\nu \rightarrow \infty $, $\zeta \rightarrow 1$ with probability 1 (i.e., the last distributional specification is omitted), and the hierarchical expression (19) becomes an SN distribution $SN_k({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }})$; (ii) when ${\varvec{\Delta }}=\mathbf 0 $, the hierarchical expression (19) is a standard multivariate t-distribution; (iii) when $\nu \rightarrow \infty $, so that $\zeta \rightarrow 1$ with probability 1, and ${\varvec{\Delta }}=\mathbf 0 $, the hierarchical expression (19) reverts to a standard multivariate normal distribution.
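The hierarchical representation (19) translates directly into a three-step sampler. The following Python sketch is one possible illustration using only the standard library. It assumes a diagonal ${\varvec{\Sigma }}$ (passed as a list of variances) and reads $\Gamma (\nu /2,\nu /2)$ as a shape-rate parameterization, so that $\zeta $ has mean 1. The function name `sample_skew_t` is hypothetical.

```python
import math
import random


def sample_skew_t(mu, sigma2, delta, nu, rng):
    """Draw one vector from ST_{k,nu}(mu, diag(sigma2), diag(delta)) via (19).

    mu, sigma2 and delta are equal-length lists; rng is a random.Random.
    """
    # Step 1: zeta ~ Gamma(shape = nu/2, rate = nu/2).
    # random.gammavariate takes (shape, scale), so scale = 2/nu.
    zeta = rng.gammavariate(nu / 2.0, 2.0 / nu)
    scale = 1.0 / math.sqrt(zeta)

    y = []
    for m, s2, d in zip(mu, sigma2, delta):
        # Step 2: w | zeta is N(0, 1/zeta) truncated to w > 0 (half-normal).
        w = abs(rng.gauss(0.0, scale))
        # Step 3: Y | w, zeta ~ N(mu + delta * w, sigma2 / zeta).
        y.append(rng.gauss(m + d * w, scale * math.sqrt(s2)))
    return y
```

Averaging many draws, for example with `rng = random.Random(0)`, gives a Monte Carlo estimate that can be compared with the mean in Eq. (16). Setting every entry of `delta` to zero yields draws from a multivariate t-distribution, in line with special case (ii).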

About this article

Cite this article

Lu, T. Bayesian inference on longitudinal-survival data with multiple features. Comput Stat 32, 845–866 (2017). https://doi.org/10.1007/s00180-016-0681-3

Received: 11 September 2015
Accepted: 30 August 2016
Published: 18 October 2016
Issue Date: September 2017
DOI: https://doi.org/10.1007/s00180-016-0681-3
