Abstract
The modeling of longitudinal and survival data is an active research area. Most research focuses on improving estimation efficiency but ignores many data features frequently encountered in practice. In this article, we develop a joint model that concurrently accounts for multiple features of longitudinal-survival data. Specifically, our joint model handles skewness, limits of detection, missingness and measurement errors in covariates, which are typically observed in longitudinal-survival data collected in many studies. We employ a Bayesian approach for making inference on the joint model. The proposed model and method are applied to an AIDS study. A few alternative models under different conditions are compared. Some interesting results are reported. Simulation studies are conducted to assess the performance of the proposed methods.
References
Albert PS, Shih JH (2010) On estimating the relationship between longitudinal measurements and time-to-event data using a simple two-stage procedure. Biometrics 66:983–987
Arellano-Valle R, Genton M (2005) On fundamental skew distributions. J Multivar Anal 96:93–116
Azzalini A, Capitanio A (2003) Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J R Stat Soc Ser B 65:367–389
Azzalini A, Genton M (2008) Robust likelihood methods based on the skew-t and related distributions. Int Stat Rev 76:106–129
Brown ER, Ibrahim JG (2003) Bayesian approaches to joint cure-rate and longitudinal models with applications to cancer vaccine trials. Biometrics 59(3):686–693
Celeux G, Forbes F, Robert C, Titterington M (2006) Deviance information criteria for missing data models. Bayesian Anal 1:651–674
Elashoff RM, Li G, Li N (2008) A joint model for longitudinal measurements and survival data in the presence of multiple failure types. Biometrics 64:762–771
Eubank R (1999) Nonparametric regression and spline smoothing. Dekker, New York
Faucett CL, Thomas DC (1996) Simultaneously modelling censored survival data and repeatedly measured covariates: a Gibbs sampling approach. Stat Med 15(15):1663–1685
Gelman A, Rubin D (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511
Gelman A, Carlin J, Stern H, Rubin D (2003) Bayesian data analysis. Chapman and Hall, London
Henderson R, Diggle P, Dobson A (2000) Joint modelling of longitudinal measurements and event time data. Biostatistics 1(4):465–480
Ho H, Lin T (2010) Robust linear mixed models using the skew-t distribution with application to schizophrenia data. Biometr J 52:449–469
Hu W, Li G, Li N (2009) A Bayesian approach to joint analysis of longitudinal measurements and competing risks failure time data. Stat Med 28:1601–1619
Huang X, Li G, Elashoff R, Pan J (2011) A general joint model for longitudinal measurements and competing risks survival data with heterogeneous random effects. Lifetime Data Anal 17:80–100
Hughes J (1999) Mixed effects models with censored data with application to HIV RNA levels. Biometrics 55:625–629
Jara A, Quintana F, Martin E (2008) Linear mixed models with skew-elliptical distributions: a Bayesian approach. Comput Stat Data Anal 52:5033–5045
Lachos V, Bandyopadhyay D, Dey D (2011) Linear and nonlinear mixed-effects models for censored HIV viral loads using normal/independent distributions. Biometrics 67:1594–1604
Li N, Elashoff RM, Li G, Tseng CH (2012) Joint analysis of bivariate longitudinal ordinal outcomes and competing risks survival times with nonparametric distributions for random effects. Stat Med 31:1707–1721
Little R, Rubin D (2002) Statistical analysis with missing data. Wiley, New York
Liu W, Wu L (2007) Simultaneous inference for semiparametric nonlinear mixed-effects models with covariate measurement errors and missing responses. Biometrics 63:342–350
Rizopoulos D (2011) Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67(3):819–829
Sahu S, Dey D, Branco M (2003) A new class of multivariate skew distributions with applications to Bayesian regression models. Can J Stat 31:129–150
Wu L (2002) A joint model for nonlinear mixed-effects models with censoring and covariates measured with error. J Am Stat Assoc 97:955–964
Wu L (2007) HIV viral dynamic models with dropouts and missing covariates. Stat Med 26:3342–3357
Wulfsohn MS, Tsiatis AA (1997) A joint model for survival and longitudinal data measured with error. Biometrics 53:330–339
Appendix: Multivariate skew distributions
Different versions of multivariate skew distributions have been introduced in the literature (Sahu et al. 2003; Azzalini and Capitanio 2003; Azzalini and Genton 2008; Jara et al. 2008). Sahu et al. (2003) developed a new class of distributions by introducing skewness into multivariate elliptical distributions. The class, which is obtained by using transformation and conditioning, contains many standard families, including the multivariate skew-normal (SN) and skew-t (ST) distributions, as special cases. A k-dimensional random vector \({\varvec{Y}}\) follows a k-variate skew-elliptical (SE) distribution if its probability density function (pdf) is given by
$$f({\varvec{y}}|{\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }};m^{(k)}_{\nu })=2^k f({\varvec{y}}|{\varvec{\mu }},{\varvec{A}};m^{(k)}_{\nu })P({\varvec{V}}>\mathbf 0 ), \tag{14}$$
where \({\varvec{A}}={\varvec{\Sigma }}+{\varvec{\Delta }}^2\), \({\varvec{\mu }}\) is a location parameter vector, \({\varvec{\Sigma }}\) is a \(k \times k\) positive definite (diagonal) covariance matrix, \({\varvec{\Delta }}=\text {diag}(\delta _1, \delta _2,\ldots , \delta _k)\) is a \(k \times k\) skewness matrix with the skewness parameter vector \({\varvec{\delta }}=(\delta _1,\delta _2,\ldots ,\delta _k)^T\); \({\varvec{V}}\) follows the elliptical distribution \(El({\varvec{\Delta }}{\varvec{A}}^{-1}({\varvec{y}}-{\varvec{\mu }}), {\varvec{I}}_{k}-{\varvec{\Delta }}{\varvec{A}}^{-1}{\varvec{\Delta }}; m^{(k)}_{\nu })\), and the density generator function is \(m^{(k)}_{\nu }(\zeta )=\frac{\Gamma (k/2)}{\pi ^{k/2}}\frac{m_{\nu }(\zeta )}{\int _0^{\infty }r^{k/2-1}m_{\nu }(r)dr}\), with \(m_{\nu }(\cdot )\) being a function such that \(\int _0^{\infty }r^{k/2-1}m_{\nu }(r)dr\) exists. The function \(m_{\nu }(\zeta )\) provides the kernel of the original elliptical density and may depend on the parameter \(\nu \). This SE distribution is denoted by \(SE({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }};m^{(k)})\). Two examples of \(m_{\nu }(\zeta )\), leading to important special cases used throughout the paper, are \(m_{\nu }(\zeta )=\exp (-\zeta /2)\) and \(m_{\nu }(\zeta )=(1+\zeta /\nu )^{-(\nu +k)/2}\), where \(\nu >0\). These two expressions lead to the multivariate SN and ST distributions, respectively. In the latter case, \(\nu \) corresponds to the degrees of freedom parameter.
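The relationship between the two kernels can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names `sn_kernel`, `st_kernel` and `max_kernel_gap`, the grid of \(\zeta \) values and the choice \(k=2\) are all illustrative assumptions. It evaluates both kernels on a grid and records how far the ST kernel is from the SN kernel as \(\nu \) grows.

```python
import math

def sn_kernel(zeta):
    """Kernel m(zeta) = exp(-zeta / 2), which leads to the SN distribution."""
    return math.exp(-zeta / 2.0)

def st_kernel(zeta, nu, k):
    """Kernel m_nu(zeta) = (1 + zeta / nu)^(-(nu + k) / 2), which leads to the ST distribution."""
    return (1.0 + zeta / nu) ** (-(nu + k) / 2.0)

def max_kernel_gap(nu, k=2, grid=None):
    """Largest absolute difference between the two kernels over a grid of zeta values."""
    # Hypothetical default grid: zeta = 0.0, 0.1, ..., 10.0.
    grid = grid or [0.1 * i for i in range(101)]
    return max(abs(st_kernel(z, nu, k) - sn_kernel(z)) for z in grid)

# The gap shrinks as the degrees of freedom increase.
gaps = {nu: max_kernel_gap(nu) for nu in (5, 50, 500, 5000)}
for nu, gap in gaps.items():
    print(f"nu = {nu:5d}: max |ST kernel - SN kernel| = {gap:.2e}")
```

On this grid the gap decreases steadily with \(\nu \), which agrees with the limit \((1+\zeta /\nu )^{-(\nu +k)/2}\rightarrow \exp (-\zeta /2)\) as \(\nu \rightarrow \infty \) for fixed \(\zeta \) and \(k\).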
As is well known, a normal distribution is a special case of an SN distribution when the skewness parameter is zero, and the ST distribution reduces to the SN distribution when the degrees of freedom are large. For completeness, this Appendix briefly summarizes the multivariate ST distribution introduced by Sahu et al. (2003), which is suitable for Bayesian inference since it is built using the conditioning method. For detailed discussions on properties of the ST distribution, see Sahu et al. (2003). Assume a k-dimensional random vector \({\varvec{Y}}\) follows a k-variate ST distribution with location vector \({\varvec{\mu }}\), \(k \times k\) positive definite (diagonal) covariance matrix \({\varvec{\Sigma }}\), \(k \times k\) skewness matrix \({\varvec{\Delta }}=\text {diag}(\delta _1, \delta _2,\ldots , \delta _k)\) and degrees of freedom \(\nu \). Its probability density function (pdf) is given by
$$f({\varvec{y}}|{\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }},\nu )=2^k t_{k,\nu }({\varvec{y}}|{\varvec{\mu }},{\varvec{A}})P({\varvec{V}}>\mathbf 0 ), \tag{15}$$
where \({\varvec{A}}={\varvec{\Sigma }}+{\varvec{\Delta }}^2\), \(t_{k,\nu }({\varvec{\mu }}, {\varvec{A}})\) denotes the k-variate t distribution with parameters \({\varvec{\mu }}\), \({\varvec{A}}\) and degrees of freedom \(\nu \), \(t_{k,\nu }({\varvec{y}}|{\varvec{\mu }}, {\varvec{A}})\) denotes the corresponding pdf, and \({\varvec{V}}\) follows the t distribution \(t_{k,\nu +k}\). We denote this distribution by \(ST_{k,\nu }({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }})\). In particular, when \({\varvec{\Sigma }}=\sigma ^2 {\varvec{I}}_k\) and \({\varvec{\Delta }}=\delta {\varvec{I}}_k\), Eq. (15) simplifies to
$$f({\varvec{y}}|{\varvec{\mu }},\sigma ^2,\delta ,\nu )=2^k(\sigma ^2+\delta ^2)^{-k/2}\frac{\Gamma ((\nu +k)/2)}{\Gamma (\nu /2)(\nu \pi )^{k/2}}\left\{ 1+\frac{({\varvec{y}}-{\varvec{\mu }})^T({\varvec{y}}-{\varvec{\mu }})}{\nu (\sigma ^2+\delta ^2)}\right\} ^{-(\nu +k)/2}T_{k,\nu +k}\left[ \left\{ \frac{\nu +(\sigma ^2+\delta ^2)^{-1}({\varvec{y}}-{\varvec{\mu }})^T({\varvec{y}}-{\varvec{\mu }})}{\nu +k}\right\} ^{-1/2}\frac{\delta ({\varvec{y}}-{\varvec{\mu }})}{\sigma \sqrt{\sigma ^2+\delta ^2}}\right] , \tag{16}$$
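where \(T_{k,\nu +k}(\cdot )\) denotes the cumulative distribution function (cdf) of \(t_{k,\nu +k}(\mathbf 0 ,{\varvec{I}}_k)\). However, unlike the SN density, the ST density cannot be written as the product of univariate ST densities. Here the components of \({\varvec{Y}}\) are dependent but uncorrelated. Note that when \({\varvec{\delta }}=\mathbf 0 \), the ST distribution reduces to the usual t-distribution. It can be shown that the mean and covariance matrix of the ST distribution \(ST_{k,\nu }({\varvec{\mu }},\sigma ^2 {\varvec{I}}_k,{\varvec{\Delta }})\) are given by
$$E({\varvec{Y}})={\varvec{\mu }}+(\nu /\pi )^{1/2}\frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}{\varvec{\delta }},\qquad cov({\varvec{Y}})=\left[ \sigma ^2{\varvec{I}}_k+{\varvec{\Delta }}^2\right] \frac{\nu }{\nu -2}-\frac{\nu }{\pi }\left[ \frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}\right] ^2{\varvec{\Delta }}^2.$$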
To obtain a zero mean vector, the location parameter should be set to \({\varvec{\mu }}=-(\nu /\pi )^{1/2}\frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}{\varvec{\delta }}\). According to Lemma 1 of Azzalini and Capitanio (2003), if \({\varvec{Y}}\) follows \(ST_{k,\nu }({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }})\), it can be represented by
$${\varvec{Y}}={\varvec{\mu }}+\zeta ^{-1/2}{\varvec{X}}, \tag{17}$$
where \(\zeta \) follows a Gamma distribution \(\Gamma (\nu /2,\nu /2)\), which is independent of \({\varvec{X}}\), and \({\varvec{X}}\) follows a k-dimensional skew-normal (SN) distribution, denoted by \(SN_k(\mathbf 0 ,{\varvec{\Sigma }},{\varvec{\Delta }})\). It follows from (17) that \({\varvec{Y}}|\zeta \sim SN_k({\varvec{\mu }}, \zeta ^{-1}{\varvec{\Sigma }},\zeta ^{-1/2}{\varvec{\Delta }})\). Following Azzalini and Genton (2008), the SN distribution of \({\varvec{Y}}\), conditional on \(\zeta \), has a convenient stochastic representation as
$${\varvec{Y}}={\varvec{\mu }}+\zeta ^{-1/2}{\varvec{\Delta }}|{\varvec{X}}_0|+\zeta ^{-1/2}{\varvec{\Sigma }}^{1/2}{\varvec{X}}_1, \tag{18}$$
where \({\varvec{X}}_0\) and \({\varvec{X}}_1\) are two independent \(N_k(\mathbf 0 ,{\varvec{I}}_k)\) random vectors. Note that the expression (18) provides a convenient device for random number generation and for implementation purposes. Let \({\varvec{w}}=\zeta ^{-1/2}|{\varvec{X}}_0|\); then \({\varvec{w}}\), conditional on \(\zeta \), follows a k-dimensional normal distribution \(N_k(\mathbf 0 , \zeta ^{-1}{\varvec{I}}_k)\) truncated in the space \({\varvec{w}}>\mathbf 0 \) (i.e., the half-normal distribution). Thus, following Jara et al. (2008), a hierarchical representation of (18) is given by
$${\varvec{Y}}|{\varvec{w}},\zeta \sim N_k({\varvec{\mu }}+{\varvec{\Delta }}{\varvec{w}},\zeta ^{-1}{\varvec{\Sigma }}),\qquad {\varvec{w}}|\zeta \sim N_k(\mathbf 0 ,\zeta ^{-1}{\varvec{I}}_k)I({\varvec{w}}>\mathbf 0 ),\qquad \zeta \sim \Gamma (\nu /2,\nu /2). \tag{19}$$
Note that the ST distribution presented in (19) reduces to the following three special cases: (i) as \(\nu \rightarrow \infty \), \(\zeta \rightarrow 1\) with probability 1 (i.e., the last distributional specification is omitted), and the hierarchical expression (19) becomes an SN distribution \(SN_k({\varvec{\mu }},{\varvec{\Sigma }},{\varvec{\Delta }})\); (ii) when \({\varvec{\Delta }}=\mathbf 0 \), the hierarchical expression (19) is a standard multivariate t-distribution; (iii) when \(\nu \rightarrow \infty \), so that \(\zeta \rightarrow 1\) with probability 1, and \({\varvec{\Delta }}=\mathbf 0 \), the hierarchical expression (19) reverts to a standard multivariate normal distribution.
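The hierarchical construction described in this Appendix translates directly into a sampling routine: draw \(\zeta \), draw a half-normal \({\varvec{w}}\) given \(\zeta \), then draw \({\varvec{Y}}\) from a normal distribution centred at \({\varvec{\mu }}+{\varvec{\Delta }}{\varvec{w}}\). The sketch below is illustrative only and rests on several assumptions: \({\varvec{\Sigma }}\) is taken to be diagonal, \(\Gamma (\nu /2,\nu /2)\) is read as a shape-rate parameterization (so that \(\zeta \) has mean 1), and the function names `zero_mean_location` and `sample_st` as well as the parameter values are hypothetical. It uses only the Python standard library.

```python
import math
import random

def zero_mean_location(delta, nu):
    """Location vector mu = -(nu/pi)^(1/2) * Gamma((nu-1)/2) / Gamma(nu/2) * delta (needs nu > 1)."""
    ratio = math.exp(math.lgamma((nu - 1.0) / 2.0) - math.lgamma(nu / 2.0))
    c = math.sqrt(nu / math.pi) * ratio
    return [-c * d for d in delta]

def sample_st(mu, sigma2, delta, nu, rng):
    """One draw from ST_{k,nu}(mu, diag(sigma2), diag(delta)) via the three-level hierarchy."""
    # Assumption: Gamma(nu/2, nu/2) is shape-rate; gammavariate takes shape and SCALE.
    zeta = rng.gammavariate(nu / 2.0, 2.0 / nu)
    y = []
    for m, s2, d in zip(mu, sigma2, delta):
        # w | zeta: N(0, 1/zeta) truncated to w > 0, i.e. a scaled half-normal.
        w = abs(rng.gauss(0.0, 1.0)) / math.sqrt(zeta)
        # Y | w, zeta: normal with mean mu + delta * w and variance sigma^2 / zeta.
        y.append(rng.gauss(m + d * w, math.sqrt(s2 / zeta)))
    return y

rng = random.Random(2017)                      # fixed seed for reproducibility
delta, sigma2, nu = [2.0, -1.0], [1.0, 0.5], 6.0
mu = zero_mean_location(delta, nu)
draws = [sample_st(mu, sigma2, delta, nu, rng) for _ in range(50_000)]
means = [sum(col) / len(draws) for col in zip(*draws)]
print("location vector:", [round(m, 3) for m in mu])
print("empirical means:", [round(m, 3) for m in means])
```

With the location vector chosen as above, the empirical means should be close to zero up to Monte Carlo error, while the sign of each \(\delta _j\) controls the direction of skewness in the corresponding component. Setting every \(\delta _j=0\) in this sketch gives draws from a multivariate t-distribution, and a very large \(\nu \) gives approximately skew-normal draws, mirroring the special cases listed above.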
Cite this article
Lu, T. Bayesian inference on longitudinal-survival data with multiple features. Comput Stat 32, 845–866 (2017). https://doi.org/10.1007/s00180-016-0681-3
DOI: https://doi.org/10.1007/s00180-016-0681-3