
Multi-transSP: Multimodal Transformer for Survival Prediction of Nasopharyngeal Carcinoma Patients

  • Conference paper
  • First Online:
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Abstract

Nasopharyngeal carcinoma (NPC) is a malignant tumor that occurs frequently in Southeast Asia and southern China. Since more precise personalized therapy planning depends on accurate prognosis, predicting patients’ overall survival (OS) from clinical data can be helpful. However, most current deep learning (DL) based methods use a single modality and fail to effectively exploit the large amount of multimodal patient data available, leading to inaccurate survival prediction. In view of this, we propose a Multimodal Transformer for Survival Prediction (Multi-TransSP) of NPC patients that uses tabular data and computed tomography (CT) images jointly. Taking advantage of both the convolutional neural network and the Transformer, our network comprises a multimodal CNN-Based Encoder and a Transformer-Based Encoder. In particular, the CNN-Based Encoder learns rich information from each specific modality, while the Transformer-Based Encoder fuses the multimodal features. Our model automatically gives the final prediction of OS with a concordance index (CI) of 0.6941 on our in-house dataset, significantly outperforming methods that use any single source of data as well as previous multimodal frameworks. Code is available at https://github.com/gluglurice/Multi-TransSP.
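The concordance index (CI) reported above measures how well the predicted risk scores order patients by survival time: for every comparable pair of patients, the one who died earlier should have been assigned the higher risk. The following is a minimal illustrative sketch of Harrell's concordance index in pure Python, not the authors' evaluation code; the toy follow-up times, event flags, and risk scores are hypothetical, and tied-time pairs are simply skipped for brevity.

```python
from itertools import combinations

def concordance_index(times, events, risks):
    """Harrell's concordance index: the fraction of comparable patient
    pairs whose predicted risks are ordered consistently with their
    observed survival times (higher risk => shorter survival).
    Tied risk predictions count as 0.5."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient i has the shorter observed time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter time is an observed
        # event (death), not a censored follow-up; tied times are skipped
        # here for simplicity.
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risks[i] > risks[j]:
            concordant += 1.0
        elif risks[i] == risks[j]:
            concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up times (months), event flags (1 = death observed,
# 0 = censored), and model risk scores for four patients.
times  = [10, 24, 36, 60]
events = [1, 1, 0, 0]
risks  = [0.9, 0.6, 0.4, 0.2]
print(concordance_index(times, events, risks))  # → 1.0 (perfect ordering)
```

A CI of 0.5 corresponds to random ordering and 1.0 to perfect ranking, so the paper's 0.6941 indicates a moderately informative risk model.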

H. Zheng and Z. Lin contributed equally to this work.




Acknowledgement

This work was supported by the National Natural Science Foundation of China (NSFC 62071314).

Author information

Correspondence to Yan Wang.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zheng, H. et al. (2022). Multi-transSP: Multimodal Transformer for Survival Prediction of Nasopharyngeal Carcinoma Patients. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13437. Springer, Cham. https://doi.org/10.1007/978-3-031-16449-1_23


  • DOI: https://doi.org/10.1007/978-3-031-16449-1_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16448-4

  • Online ISBN: 978-3-031-16449-1

