Using Guided Self-Attention with Local Information for Polyp Segmentation

Cai, Linghan; Wu, Meijing; Chen, Lijiang; Bai, Wenpei; Yang, Min; Lyu, Shuchang; Zhao, Qi

doi:10.1007/978-3-031-16440-8_60

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13434)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Automatic and precise polyp segmentation is crucial for the early diagnosis of colorectal cancer. Most existing polyp segmentation methods are based on convolutional neural networks (CNNs), which typically use global features to enhance local features through carefully designed modules in order to handle the diversity of polyps. Although CNN-based methods achieve impressive results, they cannot explicitly model long-range relations, which limits their performance. In contrast, the Transformer models long-range relations well owing to self-attention. However, self-attention often spreads attention to unexpected regions, and the Transformer’s local feature extraction is insufficient, resulting in inaccurate localization and fuzzy boundaries. To address these issues, we propose PPFormer for accurate polyp segmentation. Specifically, we first adopt a shallow CNN encoder and a deep Transformer encoder to extract rich features. In the decoder, we present PP-guided self-attention, which uses prediction maps to guide self-attention toward hard regions and thereby enhances the model’s perception of polyp boundaries. Meanwhile, the Local-to-Global mechanism is designed to encourage the Transformer to capture more information within the local window for better polyp localization. Extensive experiments on five challenging datasets show that PPFormer outperforms other advanced methods and achieves state-of-the-art results on six metrics, e.g., mean Dice and mean IoU.
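The two metrics named in the abstract, mean Dice and mean IoU, can be illustrated with a minimal sketch. It assumes binary masks (1 = polyp, 0 = background) that have already been thresholded, and it averages per-image scores over a dataset. The function names, the `eps` smoothing term, and the per-image averaging are assumptions made for illustration; the abstract does not specify the paper's exact evaluation protocol.

```python
from typing import List, Sequence, Tuple

# Binary mask as nested sequences: 1 = polyp pixel, 0 = background pixel.
Mask = Sequence[Sequence[int]]


def dice_and_iou(pred: Mask, target: Mask, eps: float = 1e-8) -> Tuple[float, float]:
    """Return (Dice, IoU) for one pair of binary masks of equal shape."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p * t        # pixels marked polyp in both masks
            pred_sum += p         # polyp pixels in the prediction
            target_sum += t       # polyp pixels in the ground truth
    union = pred_sum + target_sum - inter
    # eps (an assumed smoothing term) avoids division by zero on empty masks.
    dice = (2 * inter + eps) / (pred_sum + target_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_dice_and_iou(preds: List[Mask], targets: List[Mask]) -> Tuple[float, float]:
    """Average per-image Dice and IoU over a dataset (assumed protocol)."""
    if not preds or len(preds) != len(targets):
        raise ValueError("need equally many predictions and targets, at least one")
    scores = [dice_and_iou(p, t) for p, t in zip(preds, targets)]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(dice_and_iou(pred, target))
```

For the toy masks in the usage block, one pixel overlaps, the prediction marks two pixels, and the ground truth marks one, so Dice is about 2/3 and IoU is about 1/2. Dice weights the overlap more heavily than IoU, which is why the two scores are usually reported together.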

L. Cai and M. Wu contributed equally to this work.

Acknowledgements

This project was partly supported by the National Natural Science Foundation of China (Grant No. 62072021), the Fundamental Research Funds for the Central Universities (Grant No. YWF-22-L-532), and the Beijing Hospitals Authority’s Ascent Plan (Grant No. DFL20190701).

Author information

Authors and Affiliations

Institute of Electronic Information Engineering, Beihang University, Beijing, China
Linghan Cai, Lijiang Chen, Shuchang Lyu & Qi Zhao
Department of Gynecology and Obstetrics, Beijing Shijitan Hospital, Capital Medical University, Beijing, China
Meijing Wu, Wenpei Bai & Min Yang

Corresponding author

Correspondence to Lijiang Chen.

Editor information

Editors and Affiliations

Rochester Institute of Technology, Rochester, NY, USA
Linwei Wang
Chinese University of Hong Kong, Hong Kong, Hong Kong
Qi Dou
University of Virginia, Charlottesville, VA, USA
P. Thomas Fletcher
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Case Western Reserve University, Cleveland, OH, USA
Shuo Li

About this paper

Cite this paper

Cai, L. et al. (2022). Using Guided Self-Attention with Local Information for Polyp Segmentation. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13434. Springer, Cham. https://doi.org/10.1007/978-3-031-16440-8_60

DOI: https://doi.org/10.1007/978-3-031-16440-8_60
Published: 16 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16439-2
Online ISBN: 978-3-031-16440-8
eBook Packages: Computer Science, Computer Science (R0)
