Survival Prediction After Transarterial Chemoembolization for Hepatocellular Carcinoma: a Deep Multitask Survival Analysis Approach

  • Research Article

Journal of Healthcare Informatics Research

Abstract

Accurate prediction of postoperative survival time for patients with Barcelona Clinic Liver Cancer (BCLC) stage B hepatocellular carcinoma (HCC) is important for postoperative health care. Survival analysis is commonly used in medicine to predict the time until an event of interest occurs. Mainstream survival analysis models, such as the Cox proportional hazards model, must make strict assumptions about the underlying stochastic process in order to handle censored data, which can limit their application in clinical practice. In this paper, we propose a novel deep multitask survival model (DMSM) to analyze HCC survival data. Specifically, DMSM transforms the traditional problem of predicting survival time for patients with HCC into the problem of predicting survival probability at multiple time points, and it applies entropy regularization and a ranking loss to optimize a multitask neural network. Compared with traditional methods that discard censored data or rely on strong assumptions, DMSM makes full use of the information in the censored data without requiring such assumptions. In addition, we identify the risk factors affecting the prognosis of patients with HCC and visualize their importance ranking. In an analysis of a real dataset of patients with BCLC stage B HCC, experimental results on three different validation datasets show that DMSM achieves competitive performance, with concordance indices of 0.779, 0.727, and 0.780 and integrated Brier scores (IBS) of 0.172, 0.138, and 0.135, respectively. DMSM also has a comparatively small standard deviation of the IBS (0.002, 0.002, and 0.003) over 100 bootstrap resamples. The proposed DMSM can serve as an effective survival analysis model and provides an important means for accurately predicting the postoperative survival time of patients with BCLC stage B HCC.
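
The reformulation described above, from a single survival time to survival probabilities at several time points, can be illustrated with a small sketch. The time grid, the function name `encode_multitask_labels`, and the masking convention are assumptions made for illustration and are not taken from the paper: each patient receives one binary target per time point (1 = still alive at that time), and targets that cannot be determined because the patient was censored earlier are masked out rather than discarded.

```python
from typing import List, Tuple


def encode_multitask_labels(
    time: float, event: bool, grid: List[float]
) -> Tuple[List[int], List[int]]:
    """Return (labels, mask) for one patient (illustrative convention).

    labels[k] = 1 if the patient is known to be alive at grid[k], else 0.
    mask[k]   = 1 if the status at grid[k] is known, 0 if it is unknown
                because the patient was censored before grid[k].
    """
    labels, mask = [], []
    for t in grid:
        if event:
            # Event observed at `time`: status is known at every grid point.
            known, alive = True, time > t
        else:
            # Censored at `time`: alive up to `time`, unknown afterwards.
            known = time >= t
            alive = known
        labels.append(1 if alive else 0)
        mask.append(1 if known else 0)
    return labels, mask


# Hypothetical grid of 12, 24, and 36 months.
grid = [12.0, 24.0, 36.0]
print(encode_multitask_labels(30.0, True, grid))   # ([1, 1, 0], [1, 1, 1])
print(encode_multitask_labels(18.0, False, grid))  # ([1, 0, 0], [1, 0, 0])
```

With such a mask, a censored patient still contributes to the tasks for time points before censoring, which is one way a multitask formulation can use censored records.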

Data Availability

Shen, Lujun et al. (2019), Data from: Dynamically prognosticating patients with hepatocellular carcinoma through survival paths mapping based on time-series data, Dryad, Dataset, https://doi.org/10.5061/dryad.pd44k8r

Notes

  1. https://square.github.io/pysurvival/

  2. https://github.com/ailzy/pycox/

Acknowledgements

The authors would like to thank the College of Computer Science, Chongqing University, for providing the computing resources for this study.

Author information

Contributions

All authors collected, extracted, and analyzed the data and wrote the article. GH and HJL conceived and designed this study. HJL and SG provided critical revisions to the manuscript. All authors have approved the final version of the manuscript.

Corresponding authors

Correspondence to Huijun Liu or Shu Gong.

Ethics declarations

Ethical Approval

This declaration is “not applicable.”

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A. Data Preprocessing

All three cohorts (derivation, internal testing, and multicenter testing) contain missing values. The number and percentage of HCC patients with missing values for each variable in each dataset are provided in Table 4.

Missing values are a common problem in clinical data, so we took the following steps to limit their effect on data quality. Our HCC data include both continuous and categorical variables. For continuous features, we replaced each missing value with the median of that feature; for categorical features, we replaced each missing value with the most frequent category. On the basis of recommendations from HCC clinical experts, we applied the inclusion and exclusion criteria mentioned above, excluded some abnormal feature values, and removed nine features (lymph node metastasis, distant metastasis, invasion of vena cava or atrium, invasion of hepatic veins, branch of invaded portal vein, portal vein invasion, PS score, hepatic encephalopathy, and treatment response in last time slice). The remaining 21 features (including the Child-Pugh score), which cover demographic, clinical, and biological characteristics of the patients, were used as model inputs.
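
The imputation rule described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the column values are hypothetical, and the appendix does not say which cohort the fill values were computed on, so the sketch simply uses whatever list it is given.

```python
from collections import Counter
from statistics import median
from typing import List, Optional


def impute_median(values: List[Optional[float]]) -> List[float]:
    """Fill missing entries (None) of a continuous feature with its median."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_mode(values: List[Optional[str]]) -> List[str]:
    """Fill missing entries (None) of a categorical feature with its mode."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


# Hypothetical columns; None marks a missing value.
continuous_col = [12.0, None, 400.0, 35.0]
categorical_col = ["pos", "pos", None, "neg"]

print(impute_median(continuous_col))  # [12.0, 35.0, 400.0, 35.0]
print(impute_mode(categorical_col))   # ['pos', 'pos', 'pos', 'neg']
```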

We then normalized the data, applying the standard score method to continuous features and one-hot encoding to categorical variables. For a feature X, the standard score is

$$X' = \frac{X-\mu(X)}{\delta(X)} \tag{1}$$

where μ(X) is the mean of X and δ(X) is its standard deviation.
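
As a rough sketch of this normalization step, the snippet below applies Eq. (1) to a continuous feature and one-hot encodes a categorical one. It assumes the population standard deviation for δ(X) and an alphabetical category order; neither choice is stated in the appendix, and the function names are illustrative.

```python
from statistics import mean, pstdev
from typing import List


def standard_score(values: List[float]) -> List[float]:
    """Apply X' = (X - mu(X)) / delta(X) to every entry of a feature."""
    mu = mean(values)
    delta = pstdev(values)  # assumption: population standard deviation
    if delta == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / delta for v in values]


def one_hot(values: List[str]) -> List[List[int]]:
    """Encode each category as an indicator vector (categories sorted)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


z = standard_score([2.0, 4.0, 6.0])
print([round(x, 4) for x in z])   # [-1.2247, 0.0, 1.2247]
print(one_hot(["A", "B", "A"]))   # [[1, 0], [0, 1], [1, 0]]
```

After standardization the feature has mean 0 and standard deviation 1, so features measured on very different scales contribute comparably to the network input.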

Table 4 Statistics of the missing values in each feature of three datasets

B. Hyperparameter Tuning for the Baselines

For each experiment, we use fivefold cross-validation and select the hyperparameters that maximize the concordance index (C-index) on the validation data. The hyperparameters of DMSM are the batch size, learning rate, number of hidden nodes, λ, β, and activation function. The hyperparameters tuned for DMSM and for the comparison models are listed below; the optimal values are selected by grid search.
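
For readers unfamiliar with the selection criterion, the following is a minimal sketch of Harrell's C-index for right-censored data. The appendix does not specify which C-index variant was used, so this plain pairwise form is an assumption; it takes a higher risk score to mean a shorter expected survival and skips pairs with tied times.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[bool], risks: Sequence[float]
) -> float:
    """Fraction of comparable pairs ordered correctly by the risk score.

    A pair (i, j) is comparable when subject i has an observed event and
    times[i] < times[j]; it is concordant when risks[i] > risks[j].
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [5.0, 10.0, 15.0, 20.0]
events = [True, False, True, True]
print(concordance_index(times, events, [0.9, 0.4, 0.5, 0.1]))  # 1.0
print(concordance_index(times, events, [0.9, 0.4, 0.1, 0.5]))  # 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.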

DMSM: The number of nodes in the hidden layer is selected from [5, 10, 15, 20, 25, 30]; the activation function is selected from [ReLU, Sigmoid, Tanh]; the batch size is selected from [32, 64, 128, 256, 512]; the learning rate is selected from [0.0001, 0.001, 0.01, 0.1]; λ is selected from [0.01, 0.05, 0.1, 0.5, 1.0]; β is selected from [0.001, 0.005, 0.01, 0.05, 0.1, 0.5]. We perform fivefold cross-validation, randomly sample hyperparameter combinations from the given grid, and select the combination with the highest C-index. The hyperparameters for DMSM and the two baseline models are shown in Tables 5 and 6.
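
The DMSM grid above contains 6 × 3 × 5 × 4 × 5 × 6 = 10,800 combinations, which is why sampling from the grid is attractive. The sketch below shows one way such a random search could be organized. The grid values come from the text; the function names, the number of trials, and in particular `placeholder_cv_cindex`, which stands in for training DMSM and averaging the validation C-index over five folds, are hypothetical.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

# Search grid for DMSM as listed in the appendix.
DMSM_GRID: Dict[str, List] = {
    "nodes": [5, 10, 15, 20, 25, 30],
    "activation": ["ReLU", "Sigmoid", "Tanh"],
    "batch_size": [32, 64, 128, 256, 512],
    "learning_rate": [0.0001, 0.001, 0.01, 0.1],
    "lambda": [0.01, 0.05, 0.1, 0.5, 1.0],
    "beta": [0.001, 0.005, 0.01, 0.05, 0.1, 0.5],
}


def grid_size(grid: Dict[str, List]) -> int:
    """Number of distinct combinations in the full grid."""
    return math.prod(len(options) for options in grid.values())


def random_search(
    grid: Dict[str, List],
    score_fn: Callable[[Dict], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Dict, float]:
    """Sample n_trials configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = {}, float("-inf")
    for _ in range(n_trials):
        cfg = {name: rng.choice(options) for name, options in grid.items()}
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def placeholder_cv_cindex(cfg: Dict) -> float:
    """Hypothetical stand-in for the mean fivefold validation C-index."""
    return 0.7 - abs(cfg["learning_rate"] - 0.01)


print(grid_size(DMSM_GRID))  # 10800
best_cfg, best_score = random_search(DMSM_GRID, placeholder_cv_cindex, 20)
print(best_cfg, best_score)
```

In a real run, the scoring function would train the model on four folds, compute the C-index on the held-out fold, and return the average over the five folds.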

RSF: We followed the experimental settings provided in the RSF GitHub repository (see Note 1). The max_depth is selected from [5, 10, 15, 20]; the max_features is selected from ["sqrt", "int", "float", "log2"]; the sample_size_pct is selected from [0.55, 0.60, 0.65, 0.70]; the min_node_size is selected from [10, 20, 30, 40, 50]; the num_trees is selected from [100, 200, 300, 400, 500]. The hyperparameters are shown in Table 7.

DeepHit: We followed the experimental settings provided in the DeepHit GitHub repository (see Note 2). The alpha is selected from [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]; the sigma is selected from [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]; the num_nodes is selected from [[4, 8], [4, 16], [8, 16], [8, 32], [16, 32], [32, 32]]. The learning rate and batch size are selected from the same grids as for DMSM. The hyperparameters are shown in Table 8.

Table 5 DMSM experimental hyperparameters
Table 6 Hyperparameters of the two baseline models
Table 7 Hyperparameters of RSF model
Table 8 Hyperparameters of DeepHit model

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, G., Liu, H., Gong, S. et al. Survival Prediction After Transarterial Chemoembolization for Hepatocellular Carcinoma: a Deep Multitask Survival Analysis Approach. J Healthc Inform Res 7, 332–358 (2023). https://doi.org/10.1007/s41666-023-00139-0

  • DOI: https://doi.org/10.1007/s41666-023-00139-0
