Survival neural networks for time-to-event prediction in longitudinal study

  • Regular Paper
Knowledge and Information Systems

Abstract

Time-to-event prediction is an important practical task in longitudinal studies across fields such as manufacturing, medicine, and healthcare. Most conventional survival analysis approaches are hampered by censored failures and restrictive statistical assumptions, and few attempts have been made to develop survival learning machines that explore the underlying relationship between repeated measures of covariates and failure-free survival probability. Exploring this relationship requires a purely dynamic-data-driven prediction approach, free of survival models or statistical assumptions. To this end, we propose two real-time survival networks: a time-dependent survival neural network (TSNN) with a feed-forward architecture and a recurrent survival neural network (RSNN) incorporating long short-term memory units. The TSNN additively estimates a latent failure risk arising from the repeated measures and performs multiple binary classifications to generate prognostics of survival probability, while the RSNN, which takes time-dependent covariates as input, implicitly estimates the relation between these covariates and the survival probability. We propose a novel survival learning criterion that trains the neural networks by minimizing the censoring Kullback–Leibler divergence and guarantees monotonicity of the resulting probability. In addition to the failure-event AUC, C-index, and censoring Brier score, we redefine a survival time estimate to evaluate the performance of the competing models. Experiments on four datasets demonstrate the promise of our approach in real applications.
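For readers unfamiliar with the C-index named among the evaluation measures, the following is a minimal sketch of the standard Harrell concordance index for right-censored data. It is not taken from the article and may differ from the exact variant the authors use. The function name `concordance_index`, its arguments, and the choice to skip pairs with tied observed times are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  -- observed time per subject (failure time or censoring time)
    events -- 1 if the failure was observed, 0 if the subject was censored
    risks  -- predicted risk score; higher means earlier failure is expected
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped. A pair is also not
        # comparable when the earlier subject is censored, because the true
        # ordering of the two failure times is then unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier failure received the higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to uninformative predictions, and 0.0 means every comparable pair is ranked in reverse.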


Notes

  1. https://www.clsa-elcv.ca/about-us/about-clsa-research-platform.

Acknowledgements

This work was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Grant 396097-2015, the National Natural Science Foundation of China (NSFC) under Grants U1805263 and 61672157, the Canadian Institutes of Health Research (CIHR) under Grant 391051, the Fonds de Recherche du Québec-Santé, and the Département de médecine de famille et de médecine d’urgence at the Université de Sherbrooke. Part of this work was done while Jianfei Zhang was conducting research at CWRU with Yanfang Ye. Jianfei Zhang and Yanfang Ye’s work is partially supported by the National Science Foundation (NSF) under Grants IIS-1951504, CNS-1940859, CNS-1946327, CNS-1814825, and OAC-1940855, the Department of Justice/National Institute of Justice (DoJ/NIJ) under Grant NIJ 2018-75-CX-0032, and the Institute for Smart, Secure and Connected Systems (ISSACS) at CWRU and Cleveland Foundation under Grant 292767.

Author information

Corresponding author

Correspondence to Shengrui Wang.

About this article

Cite this article

Zhang, J., Chen, L., Ye, Y. et al. Survival neural networks for time-to-event prediction in longitudinal study. Knowl Inf Syst 62, 3727–3751 (2020). https://doi.org/10.1007/s10115-020-01472-1

  • DOI: https://doi.org/10.1007/s10115-020-01472-1
