Methods for estimating the optimal number and location of cut points in multivariate survival analysis: a statistical solution to the controversial effect of BMI

  • Original paper
  • Computational Statistics

Abstract

In clinical practice, researchers often categorize continuous variables for risk assessment. Many algorithms have been developed to find a single optimal cut point that splits a variable into two groups; however, there is often a need to determine the optimal number of cut points and their locations simultaneously. In this paper we propose a new AIC criterion, in which the AIC values are corrected by cross-validation and the Monte Carlo method, to select the optimal number of cut points. The same cross-validation and Monte Carlo methods are also used to correct the p value and the relative risk. To give biomedical researchers an easy-to-use tool, we developed an R function that uses a genetic algorithm to find the locations of the optimal cut points. We also conducted simulation experiments to study the performance of the proposed method. Finally, we applied the method to study the effect of body mass index on cervical cancer survival, for which the literature has reported inconsistent results.
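
The idea of choosing both the number and the locations of cut points by AIC can be illustrated with a minimal Python sketch. It is not the authors' R function and it simplifies their method in three ways, all of which are assumptions made for illustration: a constant-hazard (exponential) model per group stands in for a multivariate survival model, an exhaustive search over a quantile grid stands in for the genetic algorithm, and the AIC is the plain, uncorrected one with no cross-validation or Monte Carlo correction. The function names (`simulate`, `search`, and so on) and the simulated BMI-like data are hypothetical.

```python
import math
import random
from itertools import combinations


def exp_loglik(times, events):
    """Maximised log-likelihood of a constant-hazard model with censoring."""
    d = sum(events)          # number of observed events
    exposure = sum(times)    # total follow-up time
    if d == 0:
        return 0.0           # estimated hazard is 0, so the log-likelihood is 0
    return d * math.log(d / exposure) - d


def split_groups(x, times, events, cuts):
    """Assign each subject to the group given by how many cuts lie below x."""
    groups = [([], []) for _ in range(len(cuts) + 1)]
    for xi, ti, ei in zip(x, times, events):
        g = sum(xi > c for c in cuts)
        groups[g][0].append(ti)
        groups[g][1].append(ei)
    return groups


def search(x, times, events, max_cuts=3, grid_size=15, min_group=20):
    """For each k = 0..max_cuts, return the lowest (AIC, cuts) found on the grid."""
    xs = sorted(x)
    n = len(xs)
    # Candidate cut points: evenly spaced sample quantiles of x.
    candidates = sorted({xs[n * j // (grid_size + 1)]
                         for j in range(1, grid_size + 1)})
    best = {}
    for k in range(max_cuts + 1):
        for cuts in combinations(candidates, k):
            groups = split_groups(x, times, events, cuts)
            if any(len(t) < min_group for t, _ in groups):
                continue     # skip splits that leave a very small group
            loglik = sum(exp_loglik(t, e) for t, e in groups)
            aic = 2 * (k + 1) - 2 * loglik   # one hazard parameter per group
            if k not in best or aic < best[k][0]:
                best[k] = (aic, cuts)
    return best


def simulate(n=600, seed=1):
    """Hypothetical data: U-shaped risk, lower hazard for 20 < x <= 30."""
    rng = random.Random(seed)
    x, times, events = [], [], []
    for _ in range(n):
        xi = rng.uniform(15, 40)
        hazard = 0.1 if 20 < xi <= 30 else 0.4
        t = rng.expovariate(hazard)
        c = 10.0             # administrative censoring time
        x.append(xi)
        times.append(min(t, c))
        events.append(1 if t <= c else 0)
    return x, times, events


x, times, events = simulate()
best = search(x, times, events)
for k, (aic, cuts) in sorted(best.items()):
    print(k, round(aic, 1), [round(c, 1) for c in cuts])
selected_k = min(best, key=lambda k: best[k][0])
print("number of cut points with the lowest uncorrected AIC:", selected_k)
```

Because the cut points are themselves optimized on the same data, the uncorrected AIC in this sketch tends to favor too many cut points; that optimism is the kind of bias the cross-validation and Monte Carlo corrections described in the abstract are meant to address.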

Acknowledgements

The authors thank the reviewers for their comments and suggestions, which greatly improved this manuscript. We are also grateful to the Editor and the Associate Editor for their help. In addition, we are grateful to the National Center for High-performance Computing of ROC for computer time and facilities. We also thank Professor Mei-hui Guo for helpful discussions. This work was partially supported by a grant from the Ministry of Science and Technology (MOST) (MOST 107-2118-M-110-003) and a joint grant from the Kaohsiung Veteran’s General Hospital and the National Sun Yat-sen University (VGHNSU107-011).

Author information

Corresponding author

Correspondence to Jiabin Chen.

About this article

Cite this article

Chang, C., Hsieh, MK., Chiang, A.J. et al. Methods for estimating the optimal number and location of cut points in multivariate survival analysis: a statistical solution to the controversial effect of BMI. Comput Stat 34, 1649–1674 (2019). https://doi.org/10.1007/s00180-019-00908-9

  • DOI: https://doi.org/10.1007/s00180-019-00908-9
