Probabilistic Modeling Ensemble Vision Transformer Improves Complex Polyp Segmentation

Ling, Tianyi; Wu, Chengyi; Yu, Huan; Cai, Tian; Wang, Da; Zhou, Yincong; Chen, Ming; Ding, Kefeng

doi:10.1007/978-3-031-43990-2_54

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14226)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Colorectal polyps detected during colonoscopy are strongly associated with colorectal cancer, making polyp segmentation a critical clinical decision-making tool for diagnosis and treatment planning. However, accurate polyp segmentation remains challenging, particularly in cases involving diminutive polyps and other intestinal substances that produce a high false-positive rate. Previous polyp segmentation networks supervised only by binary masks may lack global semantic perception of polyps, which weakens their ability to capture and discriminate polyps in complex scenarios. To address this issue, we propose a novel Gaussian-Probabilistic guided semantic fusion method that progressively fuses the probability information of polyp positions with the decoder supervised by binary masks. Our Probabilistic Modeling Ensemble Vision Transformer Network (PETNet) effectively suppresses noise in features and significantly improves expressive capabilities at both pixel and instance levels, using only simple convolutional decoders. Extensive experiments on five widely adopted datasets show that PETNet outperforms existing methods in scenes with camouflaged polyps, appearance changes, and small polyps, and achieves a speed of about 27 FPS on edge computing devices. Code is available at: https://github.com/Seasonsling/PETNet.
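
The abstract does not say how the "probability information of polyp positions" is constructed. As a purely illustrative sketch, and not the authors' implementation, one common way to obtain such a target is to convert a binary mask into a 2D Gaussian heatmap centered on the mask. The function name `gaussian_map_from_mask` and the parameter `sigma_scale` are hypothetical, and the whole foreground is treated as a single polyp.

```python
import numpy as np

def gaussian_map_from_mask(mask: np.ndarray, sigma_scale: float = 0.5) -> np.ndarray:
    """Hypothetical helper: turn a binary mask into a soft Gaussian position map.

    Assumption: all foreground pixels belong to one polyp, so a single
    Gaussian centered on the foreground centroid is used.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        # No foreground: return an all-zero probability map.
        return np.zeros(mask.shape, dtype=np.float32)

    # Center of the Gaussian is the centroid of the foreground pixels.
    cy, cx = ys.mean(), xs.mean()

    # Spread follows the foreground extent along each axis (floored at 1 pixel).
    sy = max(sigma_scale * (ys.max() - ys.min() + 1), 1.0)
    sx = max(sigma_scale * (xs.max() - xs.min() + 1), 1.0)

    yy, xx = np.mgrid[0:mask.shape[0], 0:mask.shape[1]]
    heat = np.exp(-0.5 * (((yy - cy) / sy) ** 2 + ((xx - cx) / sx) ** 2))
    return heat.astype(np.float32)

# Toy example: a 10 x 14 rectangular "polyp" in a 64 x 64 image.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[20:30, 30:44] = 1
heat = gaussian_map_from_mask(mask)
```

Under these assumptions, a map like `heat` could act as a soft position target alongside the binary mask; how PETNet actually builds and fuses its probability maps is described in the paper and the linked repository.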

T. Ling and C. Wu—Equal contributions.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 31771477, 32070677), the Fundamental Research Funds for the Central Universities (No. 226-2022-00009), and the Key R&D Program of Zhejiang (No. 2023C03049).

Author information

Authors and Affiliations

Department of Colorectal Surgery and Oncology (Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, Key Laboratory of Molecular Biology in Medical Sciences, Zhejiang Province, China), The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Tianyi Ling, Da Wang & Kefeng Ding
Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China
Tianyi Ling, Chengyi Wu, Huan Yu, Yincong Zhou & Ming Chen
Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
Chengyi Wu
Department of Thoracic Surgery, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
Huan Yu
Department of Hepatobiliary and Pancreatic Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Tian Cai

Corresponding authors

Correspondence to Ming Chen or Kefeng Ding.

Editor information

Editors and Affiliations

Icahn School of Medicine, Mount Sinai, NYC, NY, USA, Tel Aviv University, Tel Aviv, Israel
Hayit Greenspan
Emory University, Atlanta, GA, USA
Anant Madabhushi
Queen’s University, Kingston, ON, Canada
Parvin Mousavi
The University of British Columbia, Vancouver, BC, Canada
Septimiu Salcudean
Yale University, New Haven, CT, USA
James Duncan
IBM Research, San Jose, CA, USA
Tanveer Syeda-Mahmood
Johns Hopkins University, Baltimore, MD, USA
Russell Taylor

Electronic supplementary material

Supplementary material 1 (zip 50610 KB)

Cite this paper

Ling, T. et al. (2023). Probabilistic Modeling Ensemble Vision Transformer Improves Complex Polyp Segmentation. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14226. Springer, Cham. https://doi.org/10.1007/978-3-031-43990-2_54

DOI: https://doi.org/10.1007/978-3-031-43990-2_54
Published: 01 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43989-6
Online ISBN: 978-3-031-43990-2
eBook Packages: Computer Science, Computer Science (R0)
