
Estimating nonlinear effects in the presence of cure fraction using a semi-parametric regression model

  • Original Paper
  • Computational Statistics

Abstract

Nonlinear effects between explanatory and response variables are increasingly present in new surveys. In this paper, we propose a flexible four-parameter semi-parametric cure rate survival model based on what we call the sinh Cauchy cure rate distribution. The proposed model is built on the generalized additive models for location, scale and shape, in which any or all parameters of the distribution can be parametric linear and/or nonparametric smooth functions of explanatory variables. The new model is used to fit the nonlinear relationship between explanatory variables and the cure rate. The biases in the cure rate parameter estimates caused by not incorporating such nonlinear effects in the model are investigated using Monte Carlo simulations. We discuss diagnostic measures, methods to select additive terms, and their computational implementation. The flexibility of the proposed model is illustrated by predicting lifetime and cure rate as well as identifying factors associated with these outcomes in women diagnosed with breast cancer.
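To make the idea of a covariate acting nonlinearly on the cure rate concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard two-component mixture formulation, in which the population survival is the cure probability plus the uncured share times a baseline survival function. The Weibull baseline, the logit link, the sine-shaped smooth term, and all function names and parameter values are illustrative assumptions; the paper's own baseline is the sinh Cauchy distribution, which is not reproduced here.

```python
import math


def cure_fraction(x, beta0=-0.5, amplitude=1.0):
    """Cure probability as a nonlinear function of a covariate x.

    Assumption: logit link with a hypothetical smooth term amplitude*sin(x),
    standing in for a nonparametric smoother.
    """
    eta = beta0 + amplitude * math.sin(x)
    return 1.0 / (1.0 + math.exp(-eta))


def weibull_survival(t, shape=1.5, scale=2.0):
    """Baseline survival of the uncured group (Weibull stand-in, an assumption)."""
    return math.exp(-((t / scale) ** shape))


def population_survival(t, x):
    """Mixture cure model: S_pop(t | x) = p(x) + (1 - p(x)) * S(t)."""
    p = cure_fraction(x)
    return p + (1.0 - p) * weibull_survival(t)


if __name__ == "__main__":
    # The survival curve starts at 1 and levels off at the cure fraction.
    for x in (0.0, 1.5, 3.0):
        p = cure_fraction(x)
        s_late = population_survival(50.0, x)
        print(f"x={x:.1f}  cure fraction={p:.3f}  S_pop(50)={s_late:.3f}")
```

In this sketch the plateau of the survival curve equals the cure fraction, so a model that forces the covariate effect on the cure rate to be linear on the link scale would misplace that plateau whenever the true effect is curved.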



Acknowledgements

The first author acknowledges the financial support of the “Ciência sem Fronteiras” program of CNPq (Brazil) under process number 200574/2015-9.

Author information

Corresponding author

Correspondence to Thiago G. Ramires.

Ethics declarations

Conflict of interest

The authors have declared no conflict of interest.

Appendix: Computational codes

Here, we present the code implemented with the GAMLSS package in R. The pdf, cdf, quantile function (qf) and random sample generator are

figure a

Next, we present the code used in the data analysis.

figure b

About this article

Cite this article

Ramires, T.G., Hens, N., Cordeiro, G.M. et al. Estimating nonlinear effects in the presence of cure fraction using a semi-parametric regression model. Comput Stat 33, 709–730 (2018). https://doi.org/10.1007/s00180-017-0781-8

  • DOI: https://doi.org/10.1007/s00180-017-0781-8
