Abstract
Nonlinear effects between explanatory and response variables are increasingly encountered in recent surveys. In this paper, we propose a flexible four-parameter semi-parametric cure rate survival model called the sinh Cauchy cure rate distribution. The proposed model is based on the generalized additive models for location, scale and shape (GAMLSS), in which any or all parameters of the distribution can be parametric linear and/or nonparametric smooth functions of explanatory variables. The new model is used to fit the nonlinear relationship between explanatory variables and the cure rate. The biases in the cure rate parameter estimates caused by not incorporating such nonlinear effects in the model are investigated using Monte Carlo simulations. We discuss diagnostic measures, methods to select additive terms, and their computational implementation. The flexibility of the proposed model is illustrated by predicting lifetime and cure rate proportion as well as identifying factors associated with women diagnosed with breast cancer.
Acknowledgements
The first author acknowledges the financial support of the “Ciência sem Fronteiras” program of CNPq (Brazil) under process number 200574/2015-9.
Ethics declarations
Conflict of interest
The authors have declared no conflict of interest.
Appendix: Computational codes
Here, we present the code implemented with the GAMLSS package in the software R: the pdf, cdf, quantile function (qf) and random sample generator.
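As a rough illustration of what these four functions compute, the following sketch is written in Python rather than R and is not the authors' GAMLSS implementation. It assumes a sinh Cauchy baseline with cdf \(F(y) = 1/2 + \arctan\{\nu \sinh[(y-\mu)/\sigma]\}/\pi\) and a standard mixture cure rate formulation in which a proportion \(\tau\) of individuals is cured, so that the population survival function is \(\tau + (1-\tau)[1 - F(y)]\). The parametrization and all function names (`dsc`, `psc`, `qsc`, `pop_survival`, `rsc_cure`) are assumptions made for this example.

```python
import math
import random


def dsc(y, mu=0.0, sigma=1.0, nu=1.0):
    """Density of the assumed sinh Cauchy baseline (derivative of psc)."""
    z = (y - mu) / sigma
    return nu * math.cosh(z) / (sigma * math.pi * (1.0 + (nu * math.sinh(z)) ** 2))


def psc(y, mu=0.0, sigma=1.0, nu=1.0):
    """Cumulative distribution function of the assumed baseline."""
    z = (y - mu) / sigma
    return 0.5 + math.atan(nu * math.sinh(z)) / math.pi


def qsc(p, mu=0.0, sigma=1.0, nu=1.0):
    """Quantile function, obtained by inverting psc analytically."""
    return mu + sigma * math.asinh(math.tan(math.pi * (p - 0.5)) / nu)


def pop_survival(y, mu, sigma, nu, tau):
    """Population survival under the assumed mixture cure model."""
    return tau + (1.0 - tau) * (1.0 - psc(y, mu, sigma, nu))


def rsc_cure(n, mu, sigma, nu, tau, rng=random):
    """Draw n values; cured individuals are coded as math.inf."""
    sample = []
    for _ in range(n):
        if rng.random() < tau:
            sample.append(math.inf)  # cured: the event never occurs
        else:
            sample.append(qsc(rng.random(), mu, sigma, nu))  # inverse-cdf draw
    return sample
```

Under these assumptions the population survival function levels off at \(\tau\) for large \(y\), which is the plateau that a cure rate model is meant to capture; censoring is not simulated in this sketch.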
Next, we present the code used in the data analysis.
Cite this article
Ramires, T.G., Hens, N., Cordeiro, G.M. et al. Estimating nonlinear effects in the presence of cure fraction using a semi-parametric regression model. Comput Stat 33, 709–730 (2018). https://doi.org/10.1007/s00180-017-0781-8