
Multi-transSP: Multimodal Transformer for Survival Prediction of Nasopharyngeal Carcinoma Patients

  • Conference paper
  • First Online:
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Abstract

Nasopharyngeal carcinoma (NPC) is a malignant tumor that occurs frequently in Southeast Asia and southern China. Since more precise personalized therapy planning depends on accurate prognosis, predicting patients’ overall survival (OS) from clinical data can be helpful. However, most current deep learning (DL) based methods use a single modality and fail to effectively exploit the large amount of multimodal patient data available, leading to inaccurate survival prediction. In view of this, we propose a Multimodal Transformer for Survival Prediction (Multi-TransSP) of NPC patients that uses tabular data and computed tomography (CT) images jointly. Taking advantage of both the convolutional neural network and the Transformer, our network comprises a multimodal CNN-Based Encoder and a Transformer-Based Encoder. In particular, the CNN-Based Encoder learns rich information from each specific modality, while the Transformer-Based Encoder fuses the multimodal features. Our model automatically gives the final prediction of OS with a concordance index (CI) of 0.6941 on our in-house dataset, significantly outperforming methods that use any single source of data as well as previous multimodal frameworks. Code is available at https://github.com/gluglurice/Multi-TransSP.
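The concordance index (CI) reported above measures how well the predicted risk scores order patients by survival time: for every comparable pair of patients, the one who died earlier should have been assigned the higher risk. The following is a minimal illustrative sketch of Harrell's concordance index in pure Python, not the authors' evaluation code; the toy follow-up times, event flags, and risk scores are hypothetical, and tied-time pairs are simply skipped for brevity.

```python
from itertools import combinations

def concordance_index(times, events, risks):
    """Harrell's concordance index: the fraction of comparable patient
    pairs whose predicted risks are ordered consistently with their
    observed survival times (higher risk => shorter survival).
    Tied risk predictions count as 0.5."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient i has the shorter observed time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter time is an observed
        # event (death), not a censored follow-up; tied times are skipped
        # here for simplicity.
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risks[i] > risks[j]:
            concordant += 1.0
        elif risks[i] == risks[j]:
            concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up times (months), event flags (1 = death observed,
# 0 = censored), and model risk scores for four patients.
times  = [10, 24, 36, 60]
events = [1, 1, 0, 0]
risks  = [0.9, 0.6, 0.4, 0.2]
print(concordance_index(times, events, risks))  # → 1.0 (perfect ordering)
```

A CI of 0.5 corresponds to random ordering and 1.0 to perfect ranking, so the paper's 0.6941 indicates a moderately informative risk model.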

H. Zheng and Z. Lin contributed equally to this work.




Acknowledgement

This work was supported by the National Natural Science Foundation of China (NSFC 62071314).

Author information

Correspondence to Yan Wang.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zheng, H. et al. (2022). Multi-transSP: Multimodal Transformer for Survival Prediction of Nasopharyngeal Carcinoma Patients. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13437. Springer, Cham. https://doi.org/10.1007/978-3-031-16449-1_23


  • DOI: https://doi.org/10.1007/978-3-031-16449-1_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16448-4

  • Online ISBN: 978-3-031-16449-1

