Modelling the association in bivariate survival data by using a Bernstein copula

  • Original paper
  • Journal: Computational Statistics

Abstract

Bivariate or multivariate survival data arise when a sample consists of clusters of two or more correlated subjects. This paper considers clustered bivariate survival data that are possibly censored. Two approaches are commonly used to model this type of correlated data: random effect models and marginal models. Random effect models include the frailty model, which assumes that subjects within a cluster are independent conditionally on a common non-negative random variable, the so-called frailty. In contrast, the marginal approach models the marginal distributions directly and then imposes a dependence structure through a copula function. In this manuscript, Bernstein copulas are used to account for the correlation in bivariate survival data. A two-stage parametric estimation method is developed: the first stage estimates the parameters of the marginal models, and the second stage estimates the coefficients of the Bernstein polynomials that describe the association. A penalty parameter is used to make the fit suitably smooth. Linear constraints are introduced to ensure uniform univariate margins, and the model is fitted by quadratic programming. We perform a simulation study and illustrate the method on a real data set.

Availability of data and material

Not applicable.

References

  • Babu GJ, Canty AJ, Chaubey YP (2002) Application of Bernstein polynomials for smooth estimation of a distribution and density function. J Stat Plan Inference 105:377–392

  • Bouezmarni T, Rombouts JVK, Taamouti A (2010) Asymptotic properties of the Bernstein density copula estimator for \(\alpha \)-mixing data. J Multivar Anal 101:1–10

  • Bouyé E, Gaussel N, Salmon M (2002) Investigating dynamic dependence using copulae. FERC Working Paper

  • Brown BM, Chen SX (1999) Beta-Bernstein smoothing for regression curves with compact support. Scand Stat Theory Appl 26(1):47–59

  • Chang IS, Hsiung CA, Wu YJ, Yang CC (2005) Bayesian survival analysis using Bernstein polynomials. Scand Stat Theory Appl 32(3):447–466

  • Chang IS, Chien LC, Hsiung CA, Wen CC, Wu YJ (2007) Shape restricted regression with random Bernstein polynomials. Lecture Notes-Monograph Series 54:187–202

  • Cherubini U, Luciano E, Vecchiato W (2004) Copula Methods in Finance. John Wiley & Sons, West Sussex, England

  • Choudhuri N, Ghosal S, Roy A (2004) Bayesian estimation of the spectral density of a time series. J Am Stat Assoc 99:1050–1059

  • Doha EH, Bhrawy AH, Saker MA (2011) On the derivatives of Bernstein polynomials: an application for the solution of high even-order differential equations. Bound Value Probl 2011:829543

  • Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11(2):89–121

  • Hu L (2006) Dependence patterns across financial markets: a mixed copula approach. Appl Financial Econ 16(10):717–729

  • Hu T, Zhou Q, Sun J (2017) Regression analysis of bivariate current status data under the proportional hazards model. Can J Stat 45(4):410–424

  • Hurvich CM, Tsai CL (1989) Regression and time series model selection in small samples. Biometrika 76(2):297–307

  • Janssen P, Swanepoel J, Veraverbeke N (2012) Large sample behavior of the Bernstein copula estimator. J Stat Plan Inference 142:1189–1197

  • Kauermann G, Schellhase C (2014) Flexible pair-copula estimation in D-vines using bivariate penalized splines. Stat Comput 24(6):1081–1100

  • Kauermann G, Schellhase C, Ruppert D (2013) Flexible copula density estimation with penalized hierarchical B-splines. Scand Stat Theory Appl 40:685–705

  • Killiches M, Kraus D, Czado C (2018) Model distances for vine copulas in high dimensions. Stat Comput 28:323–341

  • Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86

  • Laevens H, Deluyker H, Schukken YH, De ML, Vandermeersch R, De ME, De KA (1997) Influence of parity and stage of lactation on the somatic cell count in bacteriologically negative dairy cows. J Dairy Sci 80:3219–3226

  • Li Y, Prentice R, Lin X (2008) Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data. Biometrika 95(4):947–960

  • Longin F, Solnik B (2001) Extreme correlation of international equity markets. J Finance 56:651–678

  • McNeil AJ, Frey R, Embrechts P (2005) Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton University Press, Princeton

  • Nagler T, Schellhase C, Czado C (2017) Nonparametric estimation of simplified vine copula models: comparison of methods. Depend Model 5(1):99–120

  • Nelsen RB (2006) An Introduction to Copulas. Springer, New York

  • Osman M, Ghosh SK (2012) Nonparametric regression models for right-censored data using Bernstein polynomials. Comput Stat Data Anal 56:559–573

  • Patton AJ (2001) Modelling Time-Varying Exchange Rate Dependence Using the Conditional Copula. Discussion Paper, Department of Economics, UCSD, pp 01–09

  • Petrone S (1999) Bayesian density estimation using Bernstein polynomials. Can J Stat 27(1):105–126

  • Pfeifer D, Tsatedem HA, Mändle A, Girschig C (2016) New copulas based on general partitions-of-unity and their applications to risk management. Depend Model 4:123–140

  • Prenen L, Braekers R, Duchateau L (2017) Extending the Archimedean copula methodology to model multivariate survival data grouped in clusters of variable size. J R Stat Soc Series B 79:483–505

  • Prenen L, Braekers R, Duchateau L (2018) Investigating the correlation structure of quadrivariate udder infection times through hierarchical Archimedean copulas. Lifetime Data Anal 24(4):719–742

  • Rank J (2006) Copulas: from theory to application in finance. RISK Books, London

  • Rockinger M, Jondeau E (2001) Conditional Dependency of Financial Series: An Application of Copulas. HEC Department of Finance Working Paper No. 723

  • Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric Regression. Cambridge University Press, Cambridge

  • Ruppert D, Wand MP, Carroll RJ (2009) Semiparametric regression during 2003–2007. Electron J Stat 3:1193–1256

  • Sancetta A, Satchell S (2004) The Bernstein copula and its applications to modeling and approximations of multivariate distributions. Econom Theory 20:535–562

  • Shih JH, Louis TA (1995) Inferences on the association parameter in copula models for bivariate survival data. Biometrics 51:1384–1399

  • Stein ML (1990) A comparison of generalized cross validation and modified maximum likelihood for estimating the parameters of a stochastic process. Ann Stat 18:1139–1157

  • Tenbusch A (1997) Nonparametric curve estimation with Bernstein estimates. Metrika 45(1):1–30

  • Wahba G (1990) Spline models for observational data. SIAM, Philadelphia, PA

  • Weingessel A (2019) quadprog: Functions to solve Quadratic Programming Problems. R package version 1.5-8. Available on CRAN

  • Wood SN (2017) Generalized additive models: an introduction with R. Chapman & Hall/CRC Press, London/Boca Raton

Acknowledgements

We would like to thank the editors and anonymous referees for valuable comments and insightful suggestions, which helped us to improve the manuscript. For the simulations we used the infrastructure of the Flemish Supercomputer Center, funded by the Hercules Foundation and the Flemish Government-department Economics, Science and Innovation.

Funding

Not applicable.

Author information

Corresponding author

Correspondence to Mirza Nazmul Hasan.

Ethics declarations

Conflicts of interest

The authors declare that they have no potential conflicts of interest.

Code availability

The computational code is provided as supplementary material.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 60 KB)

Appendices

Appendix A: Penalty matrix for penalizing second order derivatives

In this appendix we derive the penalty matrix based on the integrated squared second-order derivatives of the Bernstein copula density.

$$\begin{aligned}&\int \Big (\frac{\partial ^2 c_B(u_1,u_2)}{(\partial u_1)^2}\Big )^2 du_1\,du_2\\&\quad =\boldsymbol{\beta }^T\int \Big [\frac{\partial ^2}{(\partial u_1)^2}\Big \{\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\otimes \boldsymbol{\phi }_{\mathbf{m}}(u_{2})\Big \}\Big ]^T \Big [\frac{\partial ^2}{(\partial u_1)^2}\Big \{\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\otimes \boldsymbol{\phi }_{\mathbf{m}}(u_{2})\Big \}\Big ]du_1\,du_2\,\boldsymbol{\beta }\\&\quad = \boldsymbol{\beta }^T \bigg \{\int \Big [\Big (\frac{\partial ^2}{(\partial u_1)^2}\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\Big )^T\Big (\frac{\partial ^2}{(\partial u_1)^2}\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\Big )\Big ]du_1 \otimes \int \Big [\Big (\boldsymbol{\phi }_{\mathbf{m}}(u_{2})\Big )^T\Big (\boldsymbol{\phi }_{\mathbf{m}}(u_{2})\Big )\Big ]du_2\bigg \}\boldsymbol{\beta }\\&\quad =\boldsymbol{\beta }^T P_{u_1}\boldsymbol{\beta } \end{aligned}$$

where,

$$\begin{aligned} P_{u_1} =\int \Big [\Big (\frac{\partial ^2}{(\partial u_1)^2}\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\Big )^T\Big (\frac{\partial ^2}{(\partial u_1)^2}\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\Big )\Big ]du_1 \otimes \int \Big [\Big (\boldsymbol{\phi }_{\mathbf{m}}(u_{2})\Big )^T\Big (\boldsymbol{\phi }_{\mathbf{m}}(u_{2})\Big )\Big ]du_2 \end{aligned}$$
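
The factorisation into two one-dimensional integrals rests on the mixed-product property of the Kronecker product. As a short worked step, assume \(\mathbf{a}(u_1)\) and \(\mathbf{b}(u_2)\) are row vectors, as the transposes above suggest (the symbols \(\mathbf{a}\) and \(\mathbf{b}\) are introduced here only for illustration):

$$\begin{aligned} (\mathbf{a}\otimes \mathbf{b})^T(\mathbf{a}\otimes \mathbf{b})&=(\mathbf{a}^T\otimes \mathbf{b}^T)(\mathbf{a}\otimes \mathbf{b})=(\mathbf{a}^T\mathbf{a})\otimes (\mathbf{b}^T\mathbf{b}),\\ \int \!\!\int (\mathbf{a}^T\mathbf{a})\otimes (\mathbf{b}^T\mathbf{b})\,du_1\,du_2&=\int \mathbf{a}^T\mathbf{a}\,du_1\otimes \int \mathbf{b}^T\mathbf{b}\,du_2 . \end{aligned}$$

Taking \(\mathbf{a}(u_1)\) as the second derivative of the basis vector in \(u_1\) and \(\mathbf{b}(u_2)\) as the basis vector in \(u_2\) gives \(P_{u_1}\). The derivative acts only on the first factor because the second factor does not depend on \(u_1\).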

The integrals of the second-order derivatives of Bernstein polynomials are easily calculated. The second-order derivative of (3) equals (Doha et al. 2011)

$$\begin{aligned} \frac{\partial ^2}{(\partial u)^2}\phi _{mv}(u)=\frac{(m+1)!}{(m-2)!}\sum _{k = \max (0,\,v+2-m)}^{\min (v,2)}(-1)^{k+2}\binom{2}{k}\phi _{m-2,v-k}(u) \end{aligned}$$
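
The structure of this identity can be checked numerically. The sketch below is only illustrative and rests on an assumption: Eq. (3) is not reproduced in this appendix, so the normalisation of \(\phi _{mv}\) is unknown here, and the code uses plain (un-normalised) Bernstein polynomials \(B_{v,m}(u)=\binom{m}{v}u^v(1-u)^{m-v}\). For that basis the leading constant is \(m(m-1)\), which may differ from the constant in the text; the alternating sum and its limits are the same. The function names are hypothetical.

```python
from math import comb


def bernstein(n, v, u):
    """Plain Bernstein polynomial B_{v,n}(u); defined as zero outside 0 <= v <= n."""
    if v < 0 or v > n:
        return 0.0
    return comb(n, v) * u**v * (1.0 - u) ** (n - v)


def second_derivative_identity(m, v, u):
    """m(m-1) * sum_k (-1)^k C(2,k) B_{v-k,m-2}(u), with the summation limits from the text."""
    lo, hi = max(0, v + 2 - m), min(v, 2)
    total = sum((-1) ** k * comb(2, k) * bernstein(m - 2, v - k, u)
                for k in range(lo, hi + 1))
    return m * (m - 1) * total


def second_derivative_numeric(m, v, u, h=1e-4):
    """Central finite-difference approximation of the second derivative of B_{v,m}."""
    return (bernstein(m, v, u + h) - 2.0 * bernstein(m, v, u)
            + bernstein(m, v, u - h)) / h**2


m = 5
max_err = max(
    abs(second_derivative_identity(m, v, u) - second_derivative_numeric(m, v, u))
    for v in range(m + 1)
    for u in (0.2, 0.5, 0.8)
)
print(max_err)  # small: finite-difference error only
```

The finite-difference comparison is deliberately crude; it only confirms that the closed-form sum and the numerical derivative agree for every basis index \(v\), including the boundary indices where the sum has fewer than three terms.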

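In vector form, with \(\boldsymbol{\phi }_{\mathbf{m}-2}(u)\) the row vector of the basis functions \(\phi _{m-2,v}(u)\), this is rewritten as

$$\begin{aligned} \frac{\partial ^2}{(\partial u)^2}\boldsymbol{\phi }_{\mathbf{m}}(u)=\big (\boldsymbol{\phi }_{\mathbf{m}-2}(u)B\big )w \end{aligned}$$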

with

$$\begin{aligned} B= \begin{bmatrix} 1 & -2 & 1 & 0 & \ldots & 0 \\ 0 & 1 & -2 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & 0\\ 0 & \ldots & 0 & 1 & -2 & 1 \end{bmatrix} ,\quad B\in \mathbb{R}^{(m-1)\times (m+1)} \end{aligned}$$

and \(w=\frac{(m+1)!}{(m-2)!}\). Therefore, the matrices \(P_{u_1}\) and \(P_{u_2}\) can be written as

$$\begin{aligned} P_{u_1}&= \Big [wB^T\int \Big (\boldsymbol{\phi }_{\mathbf{m}-2}(u_1)\Big )^T\boldsymbol{\phi }_{\mathbf{m}-2}(u_1)\,du_1\,Bw\Big ] \otimes \Big [\int \Big (\boldsymbol{\phi }_{\mathbf{m}}(u_{2})\Big )^T\boldsymbol{\phi }_{\mathbf{m}}(u_{2})\,du_2\Big ]\\ P_{u_2}&= \Big [\int \Big (\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\Big )^T\boldsymbol{\phi }_{\mathbf{m}}(u_{1})\,du_1\Big ]\otimes \Big [wB^T\int \Big (\boldsymbol{\phi }_{\mathbf{m}-2}(u_2)\Big )^T\boldsymbol{\phi }_{\mathbf{m}-2}(u_2)\,du_2\,Bw\Big ] \end{aligned}$$

The penalty can therefore be written as the quadratic form \(\lambda \boldsymbol{\beta }^T P_{int} \boldsymbol{\beta }\), where \(\lambda \) is the penalty parameter controlling the amount of smoothing and \(P_{int}=P_{u_1}+P_{u_2}\).
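
The assembly of \(P_{int}\) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' supplementary code: it uses plain Bernstein polynomials \(B_{v,m}(u)=\binom{m}{v}u^v(1-u)^{m-v}\) with the corresponding constant \(w=m(m-1)\) (the normalisation of Eq. (3) is not shown in this appendix), the same degree \(m\) in both directions, and the closed-form Gram integral \(\int _0^1 B_{i,n}B_{j,n}\,du=\binom{n}{i}\binom{n}{j}/\{(2n+1)\binom{2n}{i+j}\}\). The function names are hypothetical.

```python
import numpy as np
from math import comb


def gram(n):
    """G[i, j] = integral over [0, 1] of B_{i,n}(u) B_{j,n}(u) for the plain Bernstein basis."""
    return np.array([[comb(n, i) * comb(n, j) / ((2 * n + 1) * comb(2 * n, i + j))
                      for j in range(n + 1)] for i in range(n + 1)])


def second_difference(m):
    """The (m-1) x (m+1) matrix B whose rows carry the pattern (1, -2, 1)."""
    B = np.zeros((m - 1, m + 1))
    for r in range(m - 1):
        B[r, r:r + 3] = (1.0, -2.0, 1.0)
    return B


def penalty_matrix(m):
    """P_int = P_u1 + P_u2 for the tensor-product basis phi(u1) kron phi(u2)."""
    w = m * (m - 1)                      # constant for the plain basis (assumption)
    B = second_difference(m)
    D = w**2 * B.T @ gram(m - 2) @ B     # integral of phi''(u)^T phi''(u)
    G = gram(m)                          # integral of phi(u)^T phi(u)
    return np.kron(D, G) + np.kron(G, D)


P = penalty_matrix(4)
print(P.shape)  # (25, 25)
```

A quick sanity check on such a matrix is that constant coefficients, which give a flat surface, incur zero penalty, and that the matrix is symmetric and positive semi-definite, as required for a convex quadratic programme.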

Appendix B: Tables and graphs

See Tables 6 and 7 and Fig. 5.

Table 6 Corrected AIC of the estimated model based on data-driven \(\lambda \) from the simulated data with 25% censoring
Table 7 KLD and IAE of the estimated model based on data-driven \(\lambda \) and penalized integrated second order derivatives
Fig. 5 Density difference plots for the Clayton (first row), Gumbel (second row), Frank (third row) and Gaussian copula (fourth row) with 25% censoring

About this article

Cite this article

Hasan, M.N., Braekers, R. Modelling the association in bivariate survival data by using a Bernstein copula. Comput Stat 37, 781–815 (2022). https://doi.org/10.1007/s00180-021-01154-8

  • DOI: https://doi.org/10.1007/s00180-021-01154-8
