Survival Prediction After Transarterial Chemoembolization for Hepatocellular Carcinoma: a Deep Multitask Survival Analysis Approach

Huang, Guo; Liu, Huijun; Gong, Shu; Ge, Yongxin

doi:10.1007/s41666-023-00139-0

Research Article
Published: 31 July 2023

Volume 7, pages 332–358 (2023)

Journal of Healthcare Informatics Research

Abstract

The accurate prediction of postoperative survival time of patients with Barcelona Clinic Liver Cancer (BCLC) stage B hepatocellular carcinoma (HCC) is important for postoperative health care. Survival analysis is commonly used in medicine to predict the time until an event of interest occurs. Mainstream survival analysis models, such as the Cox proportional hazards model, must make strict assumptions about the underlying stochastic process in order to handle censored data, which potentially limits their application in clinical practice. In this paper, we propose a novel deep multitask survival model (DMSM) to analyze HCC survival data. Specifically, DMSM transforms the traditional survival time prediction problem for patients with HCC into a survival probability prediction problem at multiple time points and applies entropy regularization and a ranking loss to optimize a multitask neural network. Compared with traditional methods that either delete censored data or rely on strong assumptions, DMSM makes full use of the information in the censored data without making such assumptions. In addition, we identify the risk factors affecting the prognosis of patients with HCC and visualize their importance ranking. In an analysis of a real dataset of patients with BCLC stage B HCC, experimental results on three different validation datasets show that DMSM achieves competitive performance, with concordance indices of 0.779, 0.727, and 0.780 and integrated Brier scores (IBS) of 0.172, 0.138, and 0.135, respectively. The IBS of DMSM also has a comparatively small standard deviation (0.002, 0.002, and 0.003) over 100 bootstrap replicates. The proposed DMSM can serve as an effective survival analysis model and provides an important means of accurately predicting the postoperative survival time of patients with BCLC stage B HCC.
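
The reformulation described in the abstract can be illustrated with a small sketch. It shows one common way to turn a single (time, event) record into binary survival targets at several time points, with a mask so that a censored patient contributes only at the time points where the status is known. The function name `multitask_labels`, the time grid, and the masking scheme are illustrative assumptions; the encoding actually used by DMSM is not specified in this section.

```python
def multitask_labels(time, event, time_points):
    """Encode one patient as per-time-point survival targets.

    time:  observed follow-up time (event time or censoring time)
    event: True if the event was observed, False if the record is censored
    Returns (labels, mask). labels[k] is 1 if the patient is known to be
    alive at time_points[k]; mask[k] is 0 where censoring hides the status.
    """
    labels, mask = [], []
    for t in time_points:
        if time > t:
            # Still under observation after t: known to be alive at t.
            labels.append(1)
            mask.append(1)
        elif event:
            # Event happened at or before t: known not to have survived to t.
            labels.append(0)
            mask.append(1)
        else:
            # Censored at or before t: status at t is unknown, so mask it out.
            labels.append(0)
            mask.append(0)
    return labels, mask


# Hypothetical time grid (for example, months after treatment).
grid = [12, 24, 36]
observed = multitask_labels(20, True, grid)    # ([1, 0, 0], [1, 1, 1])
censored = multitask_labels(20, False, grid)   # ([1, 0, 0], [1, 0, 0])
```

Under this assumed scheme the censored record still supplies one usable target (alive at the first time point) instead of being discarded, which is the sense in which censored data are "fully used".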

Data Availability

Shen, Lujun et al. (2019), Data from: Dynamically prognosticating patients with hepatocellular carcinoma through survival paths mapping based on time-series data, Dryad, Dataset, https://doi.org/10.5061/dryad.pd44k8r

Notes

Acknowledgements

The authors would like to thank the College of Computer Science, Chongqing University, for providing the computing resources for this study.

Author information

Authors and Affiliations

College of Computer Science, Chongqing University, Chongqing, 400044, China
Guo Huang & Huijun Liu
Department of Gastroenterology, Children’s Hospital of Chongqing Medical University, Chongqing, 400044, China
Shu Gong
Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, 400044, China
Shu Gong
Chongqing Key Laboratory of Pediatrics, Chongqing, 400044, China
Shu Gong
School of Big Data & Software Engineering, Chongqing University, Chongqing, 401331, China
Yongxin Ge

Contributions

All authors collected, extracted, and analyzed the data and wrote the article. GH and HJL conceived and designed this study. HJL and SG provided critical revisions to the manuscript. All authors have approved the final version of the manuscript.

Corresponding authors

Correspondence to Huijun Liu or Shu Gong.

Ethics declarations

Ethical Approval

This declaration is “not applicable.”

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A. Data Preprocessing

All three cohorts (derivation, internal testing, and multicenter testing) contain missing values. The number and percentage of HCC patients with missing values for each variable are provided in Table 4.

Missing values are a common problem in clinical data, so we took the following measures to preserve data quality. Our HCC data include both continuous and categorical variables. For continuous features, we filled missing values with the median of the feature; for categorical features, we filled missing values with the most frequent category. On the basis of the recommendations of HCC clinical experts, we applied the inclusion and exclusion criteria mentioned above, excluded some abnormal feature values, and removed nine features (lymph node metastasis, distant metastasis, invasion of vena cava or atrium, invasion of hepatic veins, branch of invaded portal vein, portal vein invasion, PS score, hepatic encephalopathy, and treatment response in last time slice). The remaining 21 features (including Child-Pugh), which cover the demographic, clinical, and biological characteristics of patients, were used as inputs to the model.

Then, we normalize the data, using the standard score method for continuous features and one-hot encoding for categorical variables. For a feature X, the output of the standard score method is

$$X' = \frac{X - \mu(X)}{\delta(X)} \tag{1}$$

where μ(X) is the mean and δ(X) is the standard deviation of X.
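
The preprocessing steps above (median and most-frequent imputation, standard score scaling as in Eq. 1, and one-hot encoding) can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' pipeline: the helper names and the toy columns are hypothetical, and the population standard deviation is assumed for δ because the text does not say which estimator was used.

```python
from collections import Counter
from statistics import mean, median, pstdev


def impute_median(values):
    """Fill missing entries (None) of a continuous feature with its median."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_most_frequent(values):
    """Fill missing entries (None) of a categorical feature with its mode."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def standard_score(values):
    """Apply Eq. 1: subtract the mean and divide by the standard deviation."""
    mu = mean(values)
    delta = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / delta for v in values]


def one_hot(values):
    """Map each category to a 0/1 indicator vector (categories sorted)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical toy columns with missing entries.
lab_value = [10.0, None, 30.0, 20.0]
child_pugh = ["A", "B", None, "A"]

lab_filled = impute_median(lab_value)          # [10.0, 20.0, 30.0, 20.0]
lab_scaled = standard_score(lab_filled)        # zero mean, unit deviation
cp_filled = impute_most_frequent(child_pugh)   # ["A", "B", "A", "A"]
cp_encoded = one_hot(cp_filled)                # [[1, 0], [0, 1], [1, 0], [1, 0]]
```

In practice the median, mode, mean, and standard deviation would typically be estimated on the derivation cohort only and then reused for the testing cohorts, so that no information leaks from the test data; whether the study did this is not stated here.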

Table 4 Statistics of the missing values in each feature of three datasets

B. Hyperparameter Tuning for the Baselines

For each experiment, we use fivefold cross-validation and choose the hyperparameters that maximize the concordance index (C-index) on the validation dataset. For DMSM, the hyperparameters are the batch size, learning rate, number of hidden nodes, λ, β, and activation function. The hyperparameters tuned for DMSM and the comparison models are listed below; the optimal values are selected by grid search.
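
Because the C-index is the selection criterion, a small reference implementation may help make it concrete. The sketch below computes Harrell's C-index by direct pair counting; it assumes that a higher risk score means a shorter expected survival, gives tied scores half credit, and ignores ties in time. The function name and the toy data are illustrative, and this is not the evaluation code used in the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index by O(n^2) pair counting.

    A pair (i, j) is comparable when subject i had an observed event and
    times[i] < times[j]. It is concordant when the subject who failed
    earlier was assigned the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: the second subject is censored.
times = [5, 10, 15, 20]
events = [True, False, True, True]

perfect = concordance_index(times, events, [0.9, 0.4, 0.6, 0.1])   # 1.0
partial = concordance_index(times, events, [0.9, 0.4, 0.3, 0.6])   # 0.75
constant = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])  # 0.5
```

A value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which gives a scale for reading the reported values of 0.727 to 0.780.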

DMSM: The number of nodes in the hidden layer is selected from [5, 10, 15, 20, 25, 30]; the activation function from [ReLU, Sigmoid, Tanh]; the batch size from [32, 64, 128, 256, 512]; the learning rate from [0.0001, 0.001, 0.01, 0.1]; λ from [0.01, 0.05, 0.1, 0.5, 1.0]; and β from [0.001, 0.005, 0.01, 0.05, 0.1, 0.5]. We perform fivefold cross-validation, randomly select parameters from the given grid, and finally select the set of hyperparameters with the highest C-index. The hyperparameters for DMSM and the two baseline models are shown in Tables 5 and 6.
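
The search procedure for DMSM can be sketched as follows. The grid values are taken from the text above; everything else (the function names, the number of sampled configurations, the seed, and the `dummy_score` placeholder that stands in for training a model and computing its validation C-index) is a hypothetical scaffold, not the authors' code.

```python
import random

# Candidate values as listed for DMSM.
DMSM_GRID = {
    "hidden_nodes": [5, 10, 15, 20, 25, 30],
    "activation": ["ReLU", "Sigmoid", "Tanh"],
    "batch_size": [32, 64, 128, 256, 512],
    "learning_rate": [0.0001, 0.001, 0.01, 0.1],
    "lambda": [0.01, 0.05, 0.1, 0.5, 1.0],
    "beta": [0.001, 0.005, 0.01, 0.05, 0.1, 0.5],
}


def grid_size(grid):
    """Number of distinct configurations in the full grid."""
    size = 1
    for values in grid.values():
        size *= len(values)
    return size


def sample_configs(grid, n_configs, seed=0):
    """Randomly draw configurations from the grid (with replacement)."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in grid.items()}
            for _ in range(n_configs)]


def k_fold_indices(n_data, k=5, seed=0):
    """Yield (train_indices, validation_indices) for k shuffled folds."""
    indices = list(range(n_data))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def select_best(grid, score_fn, n_configs, n_data, k=5):
    """Return the sampled configuration with the highest mean fold score."""
    best_config, best_score = None, float("-inf")
    for config in sample_configs(grid, n_configs):
        scores = [score_fn(config, train, val)
                  for train, val in k_fold_indices(n_data, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_config, best_score = config, mean_score
    return best_config, best_score


def dummy_score(config, train_idx, val_idx):
    """Placeholder for 'train on train_idx, return C-index on val_idx'."""
    return 0.5 + config["hidden_nodes"] / 100.0


total = grid_size(DMSM_GRID)  # 6 * 3 * 5 * 4 * 5 * 6 = 10800
best_config, best_score = select_best(DMSM_GRID, dummy_score,
                                      n_configs=10, n_data=50)
```

With 10,800 combinations and five folds each, an exhaustive search would require 54,000 training runs, which is one practical reason to sample configurations from the grid rather than enumerate it.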

RSF: We followed the experimental settings provided in the RSF GitHub repository.^{Footnote 1} The max_depth is selected from [5, 10, 15, 20]; max_features from ["sqrt", "int", "float", "log2"]; sample_size_pct from [0.55, 0.60, 0.65, 0.70]; min_node_size from [10, 20, 30, 40, 50]; and num_trees from [100, 200, 300, 400, 500]. The hyperparameters are shown in Table 7.

DeepHit: We followed the experimental settings provided in the DeepHit GitHub repository.^{Footnote 2} The alpha is selected from [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]; sigma from [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]; and num_nodes from [[4, 8], [4, 16], [8, 16], [8, 32], [16, 32], [32, 32]]. The candidate values for the learning rate and batch size are the same as for DMSM. The hyperparameters are shown in Table 8.

Table 5 DMSM experimental hyperparameters

Table 6 Hyperparameters of two-baseline models

Table 7 Hyperparameters of RSF model

Table 8 Hyperparameters of DeepHit model

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, G., Liu, H., Gong, S. et al. Survival Prediction After Transarterial Chemoembolization for Hepatocellular Carcinoma: a Deep Multitask Survival Analysis Approach. J Healthc Inform Res 7, 332–358 (2023). https://doi.org/10.1007/s41666-023-00139-0

Received: 04 November 2022
Revised: 20 February 2023
Accepted: 16 July 2023
Published: 31 July 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s41666-023-00139-0
