Abstract
Bivariate or multivariate survival data arise when a sample consists of clusters of two or more correlated subjects. This paper considers clustered bivariate survival data that are possibly censored. Two approaches are commonly used to model this type of correlated data: random effect models and marginal models. Random effect models, of which the frailty model is an example, assume that the subjects within a cluster are independent conditionally on a common non-negative random variable, the so-called frailty. In contrast, the marginal approach models the marginal distributions directly and then imposes a dependence structure through copula functions. In this manuscript, Bernstein copulas are used to account for the correlation in bivariate survival data. A two-stage parametric estimation method is developed: the first stage estimates the parameters of the marginal models, and the second stage estimates the coefficients of the Bernstein polynomials in the association. A penalty parameter controls the smoothness of the fit. Linear constraints are introduced to ensure uniform univariate margins, and the model is fitted by quadratic programming. We perform a simulation study and illustrate the method on a real data set.
Availability of data and material
Not applicable.
References
Babu GJ, Canty AJ, Chaubey YP (2002) Application of Bernstein polynomials for smooth estimation of a distribution and density function. J Stat Plan Inference 105:377–392
Bouezmarni T, Rombouts JVK, Taamouti A (2010) Asymptotic properties of the Bernstein density copula estimator for \(\alpha \)-mixing data. J Multivar Anal 101:1–10
Bouyé E, Gaussel N, Salmon M (2002) Investigating dynamic dependence using copulae. FERC Working Paper
Brown BM, Chen SX (1999) Beta-Bernstein smoothing for regression curves with compact support. Scand Stat Theory Appl 26(1):47–59
Chang IS, Hsiung CA, Wu YJ, Yang CC (2005) Bayesian survival analysis using Bernstein polynomials. Scand Stat Theory Appl 32(3):447–466
Chang IS, Chien LC, Hsiung CA, Wen CC, Wu YJ (2007) Shape restricted regression with random Bernstein polynomials. Lecture Notes-Monograph Series 54:187–202
Cherubini U, Luciano E, Vecchiato W (2004) Copula Methods in Finance. John Wiley & Sons, West Sussex, England
Choudhuri N, Ghosal S, Roy A (2004) Bayesian estimation of the spectral density of a time series. J Am Stat Assoc 99:1050–1059
Doha EH, Bhrawy AH, Saker MA (2011) On the derivatives of Bernstein polynomials: an application for the solution of high even-order differential equations. Bound Value Probl 2011:829543
Eilers PHC, Marx BD (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11(2):89–121
Hu L (2006) Dependence patterns across financial markets: a mixed copula approach. Appl Financial Econ 16(10):717–729
Hu T, Zhou Q, Sun J (2017) Regression analysis of bivariate current status data under the proportional hazards model. Can J Stat 45(4):410–424
Hurvich CM, Tsai CL (1989) Regression and time series model selection in small samples. Biometrika 76(2):297–307
Janssen P, Swanepoel J, Veraverbeke N (2012) Large sample behavior of the Bernstein copula estimator. J Stat Plan Inference 142:1189–1197
Kauermann G, Schellhase C (2014) Flexible pair-copula estimation in D-vines using bivariate penalized splines. Stat Comput 24(6):1081–1100
Kauermann G, Schellhase C, Ruppert D (2013) Flexible copula density estimation with penalized hierarchical B-splines. Scand Stat Theory Appl 40:685–705
Killiches M, Kraus D, Czado C (2018) Model distances for vine copulas in high dimensions. Stat Comput 28:323–341
Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86
Laevens H, Deluyker H, Schukken YH, De ML, Vandermeersch R, De ME, De KA (1997) Influence of parity and stage of lactation on the somatic cell count in bacteriologically negative dairy cows. J Dairy Sci 80:3219–3226
Li Y, Prentice R, Lin X (2008) Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data. Biometrika 95(4):947–960
Longin F, Solnik B (2001) Extreme correlation of international equity markets. J Finance 56:651–678
McNeil AJ, Frey R, Embrechts P (2005) Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton University Press, Princeton
Nagler T, Schellhase C, Czado C (2017) Nonparametric estimation of simplified vine copula models: comparison of methods. Depend Model 5(1):99–120
Nelsen RB (2006) An Introduction to Copulas. Springer, New York
Osman M, Ghosh SK (2012) Nonparametric regression models for right-censored data using Bernstein polynomials. Comput Stat Data Anal 56:559–573
Patton AJ (2001) Modelling Time-Varying Exchange Rate Dependence Using the Conditional Copula. Discussion Paper, Department of Economics, UCSD, pp 01–09
Petrone S (1999) Bayesian density estimation using Bernstein polynomials. Can J Stat 27(1):105–126
Pfeifer D, Tsatedem HA, Mändle A, Girschig C (2016) New copulas based on general partitions-of-unity and their applications to risk management. Depend Model 4:123–140
Prenen L, Braekers R, Duchateau L (2017) Extending the Archimedean copula methodology to model multivariate survival data grouped in clusters of variable size. J R Stat Soc Series B 79:483–505
Prenen L, Braekers R, Duchateau L (2018) Investigating the correlation structure of quadrivariate udder infection times through hierarchical Archimedean copulas. Lifetime Data Anal 24(4):719–742
Rank J (2006) Copulas: from theory to application in finance. RISK Books, London
Rockinger M, Jondeau E (2001) Conditional Dependency of Financial Series: An Application of Copulas. HEC Department of Finance Working Paper No. 723
Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric Regression. Cambridge University Press, Cambridge
Ruppert D, Wand MP, Carroll RJ (2009) Semiparametric regression during 2003–2007. Electron J Stat 3:1193–1256
Sancetta A, Satchell S (2004) The Bernstein copula and its applications to modeling and approximations of multivariate distributions. Econ Theory 20:535–562
Shih JH, Louis TA (1995) Inferences on the association parameter in copula models for bivariate survival data. Biometrics 51:1384–1399
Stein ML (1990) A comparison of generalized cross validation and modified maximum likelihood for estimating the parameters of a stochastic process. Ann Stat 18:1139–1157
Tenbusch A (1997) Nonparametric curve estimation with Bernstein estimates. Metrika 45(1):1–30
Wahba G (1990) Spline models for observational data. SIAM, Philadelphia, PA
Weingessel A (2019) quadprog: Functions to solve Quadratic Programming Problems. R package version 1.5-8. Available on CRAN
Wood SN (2017) Generalized additive models: an introduction with R. Chapman & Hall/CRC Press, London/Boca Raton
Acknowledgements
We would like to thank the editors and anonymous referees for valuable comments and insightful suggestions, which helped us to improve the manuscript. For the simulations we used the infrastructure of the Flemish Supercomputer Center, funded by the Hercules Foundation and the Flemish Government-department Economics, Science and Innovation.
Funding
Not applicable.
Ethics declarations
Conflicts of interest
The authors declare that they have no potential conflicts of interest.
Code availability
The computational code is included as supplementary text.
Appendices
Appendix A: Penalty matrix for penalizing second-order derivatives
In this appendix we give the penalty matrix based on the second-order derivatives.
where,
The integrals of the second-order derivatives of Bernstein polynomials are easily calculated. The second-order derivative of (3) equals (Doha et al. 2011)
This is rewritten as,
with
and \(w=\frac{(m+1)!}{(m-2)!}\). Therefore, the matrices \(P_{u_1}\) and \(P_{u_2}\) are given by
So, the penalty can be written as a quadratic form \(\lambda \varvec{\beta }^T P_{int} \varvec{\beta }\), where \(\lambda \) is the penalty parameter steering the amount of smoothness and \(P_{int}=P_{u_1}+P_{u_2}\).
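As an illustration of how such a penalty matrix can be assembled numerically, the following Python sketch builds \(P_{u_1}\), \(P_{u_2}\) and \(P_{int}\) for a Bernstein copula density of degree \(m\). It is not the supplementary code of the paper and rests on several assumptions: the basis functions are taken to be \((m+1)B_{k,m}(u)\) with \(B_{k,m}(u)=\binom{m}{k}u^k(1-u)^{m-k}\), the penalty is taken to be the integrated squared second-order derivative in each direction, and \(\varvec{\beta }\) is assumed to be stacked with the first index varying slowest. Under these assumptions the standard identity \(B''_{k,m}=m(m-1)(B_{k-2,m-2}-2B_{k-1,m-2}+B_{k,m-2})\) produces the factor \(w=(m+1)m(m-1)\), and the integral \(\int_0^1 B_{i,n}B_{j,n}\,du=\binom{n}{i}\binom{n}{j}/\{(2n+1)\binom{2n}{i+j}\}\) is available in closed form. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernstein_gram(n):
    """G[i][j] = integral over [0, 1] of B_{i,n}(u) * B_{j,n}(u) (closed form)."""
    return [[Fraction(comb(n, i) * comb(n, j), (2 * n + 1) * comb(2 * n, i + j))
             for j in range(n + 1)] for i in range(n + 1)]


def second_diff(m):
    """D[k][j] = coefficient of B_{j,m-2} in B''_{k,m} / (m (m - 1))."""
    D = [[Fraction(0)] * (m - 1) for _ in range(m + 1)]
    for k in range(m + 1):
        for j, c in ((k - 2, 1), (k - 1, -2), (k, 1)):
            if 0 <= j <= m - 2:  # basis functions outside this range vanish
                D[k][j] = Fraction(c)
    return D


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kron(A, B):
    """Kronecker product; the index of A varies slowest."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def penalty_matrices(m):
    """Return (P_u1, P_u2, P_int) under the assumptions stated in the lead-in."""
    if m < 2:
        raise ValueError("a second-order penalty needs degree m >= 2")
    w = (m + 1) * m * (m - 1)  # w = (m + 1)! / (m - 2)!
    D = second_diff(m)
    core = matmul(matmul(D, bernstein_gram(m - 2)), transpose(D))
    P1 = [[w * w * x for x in row] for row in core]                     # curvature part
    G = [[(m + 1) ** 2 * x for x in row] for row in bernstein_gram(m)]  # plain Gram part
    P_u1 = kron(P1, G)  # penalizes curvature in u_1
    P_u2 = kron(G, P1)  # penalizes curvature in u_2
    P_int = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(P_u1, P_u2)]
    return P_u1, P_u2, P_int


def quad_form(P, beta):
    """beta^T P beta for a stacked coefficient vector beta."""
    return sum(bi * pij * bj for bi, row in zip(beta, P) for pij, bj in zip(row, beta))
```

With constant coefficients \(\beta _{k_1 k_2}=1/(m+1)^2\), which give a constant density under the assumed basis, `quad_form(P_int, beta)` is exactly zero, since a constant surface has no curvature. Exact fractions are used here to avoid rounding; the entries would have to be converted to floats before being passed to a numerical quadratic programming solver.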
Appendix B: Tables and graphs
Cite this article
Hasan, M.N., Braekers, R. Modelling the association in bivariate survival data by using a Bernstein copula. Comput Stat 37, 781–815 (2022). https://doi.org/10.1007/s00180-021-01154-8
DOI: https://doi.org/10.1007/s00180-021-01154-8