Polyp segmentation with convolutional MLP

Jin, Yan; Hu, Yibiao; Jiang, Zhiwei; Zheng, Qiufu

doi:10.1007/s00371-022-02630-y

Original article
Published: 23 August 2022

The Visual Computer, volume 39, pages 4819–4837 (2023)

Yan Jin¹ (ORCID: orcid.org/0000-0001-8956-7684), Yibiao Hu¹, Zhiwei Jiang¹ & Qiufu Zheng¹

Abstract

Accurate polyp segmentation can help doctors find and resect abnormal tissue and reduce the chance that polyps develop into colorectal cancer. Current polyp segmentation networks are still challenged by complicated scenarios in which polyps vary widely in shape, size, color, and appearance. In this paper, we propose a convolutional multilayer perceptron (MLP) polyp segmentation network to achieve more accurate polyp segmentation in colonoscopy images. The proposed network adopts a convolutional MLP encoder and enhances the low-level features using a parallel self-attention module. Furthermore, instead of directly adding encoder features to the decoder, we introduce a cascaded context aggregation module to aggregate high-level semantic features and low-level local features. Finally, channel guide group reverse attention is used to enhance structural and textural details by mining the relationship between areas and boundary cues. The proposed approach is evaluated on six widely adopted datasets and demonstrates superior performance compared to other state-of-the-art models.
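
To make the reverse-attention idea mentioned in the abstract more concrete, the following is a minimal NumPy sketch of plain reverse attention as it is commonly described in the salient-object-detection and polyp-segmentation literature: a coarse prediction is turned into a weight map that suppresses already-confident foreground, so later layers focus on boundaries and missed regions. It is not the channel guide group variant proposed in this paper, whose details the abstract does not give. The function name `reverse_attention`, the tensor shapes, and the use of raw logits as input are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def reverse_attention(features, coarse_logits):
    """Weight features by the complement of a coarse foreground probability.

    features:      array of shape (C, H, W), decoder or encoder features
    coarse_logits: array of shape (H, W), a coarse segmentation map (logits)
    returns:       array of shape (C, H, W)

    Assumption: this is the plain formulation, weights = 1 - sigmoid(logits).
    Any grouping or channel guidance used in the paper is not modeled here.
    """
    weights = 1.0 - sigmoid(coarse_logits)   # high where the coarse map says "background"
    return features * weights[None, :, :]    # broadcast the weight map over channels


if __name__ == "__main__":
    feats = np.ones((2, 2, 2))
    # One uncertain pixel pair (logit 0), one confident foreground, one confident background.
    logits = np.array([[0.0, 50.0], [-50.0, 0.0]])
    print(reverse_attention(feats, logits)[0])
```

In this sketch, pixels the coarse map already labels confidently as polyp receive weights near zero, while uncertain and background pixels keep most of their feature response, which is why the mechanism is usually associated with refining boundary detail.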

Author information

Authors and Affiliations

College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, Zhejiang Province, China
Yan Jin, Yibiao Hu, Zhiwei Jiang & Qiufu Zheng

Corresponding author

Correspondence to Yan Jin.

Ethics declarations

Conflict of interest

There are no conflicts of interest in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jin, Y., Hu, Y., Jiang, Z. et al. Polyp segmentation with convolutional MLP. Vis Comput 39, 4819–4837 (2023). https://doi.org/10.1007/s00371-022-02630-y

Accepted: 22 July 2022
Published: 23 August 2022
Issue Date: October 2023
DOI: https://doi.org/10.1007/s00371-022-02630-y
