Existing automatic contouring methods for primary nasopharyngeal carcinoma (NPC) tumors and metastatic lymph nodes (MLNs) often suffer from low segmentation accuracy and cannot handle multi-modal images properly. High inter-patient physiological variation and ineffective multi-modal information fusion pose further difficulties. To address these issues, we present a 3D reconstruction-oriented, fully automatic multi-modal segmentation method that delineates primary NPC tumors and MLNs via a dual attention-guided VNet. Specifically, we leverage a physiologically-sensitive feature enhancement (PFE) module that emphasizes long-range spatial context in tumor regions of interest and thereby copes with the variability arising from inter-patient characteristics. This helps extract 3D spatial features and facilitates high-quality reconstruction of the 3D tumor geometry. Next, we develop a multi-modal feature aggregation (MFA) module that describes multi-scale, modality-aware features in order to explore effective information aggregation across multi-modal images. To the best of our knowledge, this is the first fully automatic, highly accurate segmentation framework for primary NPC tumors and MLNs on combined CT-MR datasets. Experimental results on clinical datasets validate the effectiveness of our method and show that it outperforms state-of-the-art methods.
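The abstract does not give the internals of the PFE module, so the following is only a generic illustration of what "long-range spatial context" can mean for a 3D feature volume: a non-local style spatial self-attention in which every voxel aggregates information from every other voxel. It is a minimal NumPy sketch, not the authors' implementation. The function name `spatial_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the residual connection are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_self_attention(feat, w_q, w_k, w_v):
    """Hypothetical non-local spatial attention over a 3D feature volume.

    feat: array of shape (C, D, H, W)
    w_q, w_k: query/key projections of shape (C_r, C)
    w_v: value projection of shape (C, C)
    Returns an array with the same shape as feat.
    """
    c, d, h, w = feat.shape
    x = feat.reshape(c, -1)        # (C, N) with N = D * H * W voxels
    q = w_q @ x                    # (C_r, N) queries
    k = w_k @ x                    # (C_r, N) keys
    v = w_v @ x                    # (C, N) values
    # attn[i, j] is the weight voxel i assigns to voxel j; each row sums to 1.
    attn = softmax(q.T @ k / np.sqrt(q.shape[0]), axis=-1)   # (N, N)
    # Each output voxel is a weighted sum of the values at all voxels.
    out = v @ attn.T               # (C, N)
    # Residual connection (an assumption) keeps the original features.
    return feat + out.reshape(c, d, h, w)


# Usage on a tiny random volume (illustrative sizes only).
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 2, 3, 3))
w_q = rng.standard_normal((2, 4))
w_k = rng.standard_normal((2, 4))
w_v = rng.standard_normal((4, 4))
enhanced = spatial_self_attention(feat, w_q, w_k, w_v)
```

Because the attention matrix has one row and one column per voxel, its memory cost grows quadratically with the volume size, which is why attention of this kind is typically applied to downsampled feature maps rather than to full-resolution CT or MR volumes.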

Data Availability
The data were utilized with permission for this study and are therefore not available to the public.
This work was supported by the National Natural Science Foundation of China (No. 11921006), the Beijing Outstanding Young Scientists Program, the National Grand Instrument Project (No. 2019YFF01014400), and the Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (No. SML2021SP101).
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
About this article
Cite this article
Meng, D., Li, S., Sheng, B. et al. 3D reconstruction-oriented fully automatic multi-modal tumor segmentation by dual attention-guided VNet. Vis Comput 39, 3183–3196 (2023). https://doi.org/10.1007/s00371-023-02965-0
DOI: https://doi.org/10.1007/s00371-023-02965-0