RT-Net: Region-Enhanced Attention Transformer Network for Polyp Segmentation

  • Research
  • Neural Processing Letters

Abstract

Colonic polyps are highly correlated with colorectal cancer, and preventing the cancer depends on detecting and removing polyps at an early stage. Detection, however, relies on the physician’s experience and is prone to missed diagnoses. This drawback motivates us to design an algorithm that automatically assists physicians during detection and reduces the rate of missed polyps. Polyp segmentation is challenging because polyps vary in appearance and have blurred borders with the surrounding mucosa, and existing CNN-based polyp segmentation algorithms struggle to learn long-range dependencies. We therefore propose a region-enhanced attention transformer network (RT-Net) for polyp segmentation. Unlike existing CNN-based approaches, it employs a pyramid Transformer encoder to improve the learning ability and robustness of the network. In addition, we introduce three modules: the residual multiscale (RMS) module, the region-enhanced attention (REA) module, and the feature aggregation (FA) module. The RMS module learns multiscale information from the encoder features. The REA module uses the prediction map of each decoder layer to guide the network in building target-region and boundary cues, compensating for the local fields of view that the encoder lacks. The FA module efficiently aggregates the features from the REA module with those from the decoding layer to achieve better segmentation performance. RT-Net is evaluated on five benchmark polyp datasets. Extensive experiments demonstrate that RT-Net performs excellently compared with other state-of-the-art methods.
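The abstract gives no equations for the REA module, so the following is only a minimal sketch of the general idea it describes: a coarse prediction map is turned into a region cue and a boundary cue, which then reweight a feature map. The function names, the choice of sigmoid for the region cue, the uncertainty-based boundary cue `1 - |2p - 1|`, and the way the two cues are summed are all assumptions made for illustration, not the authors' formulation.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def region_boundary_cues(logits):
    """Derive two attention maps from a coarse prediction map (2D list of logits).

    region   = sigmoid(logit): high inside the predicted target.
    boundary = 1 - |2p - 1|:   high where the prediction is uncertain (p near 0.5),
                               which is assumed here to mark the object border.
    """
    region = [[sigmoid(v) for v in row] for row in logits]
    boundary = [[1.0 - abs(2.0 * p - 1.0) for p in row] for row in region]
    return region, boundary


def enhance(feature, logits):
    """Reweight a single-channel feature map by the sum of both cues (hypothetical)."""
    region, boundary = region_boundary_cues(logits)
    return [
        [f * (r + b) for f, r, b in zip(f_row, r_row, b_row)]
        for f_row, r_row, b_row in zip(feature, region, boundary)
    ]


# Toy example: one row with a confident background pixel, an uncertain pixel,
# and a confident foreground pixel.
logits = [[-10.0, 0.0, 10.0]]
feature = [[2.0, 2.0, 2.0]]
enhanced = enhance(feature, logits)
```

In this toy row the background feature is suppressed, while the uncertain pixel receives the largest weight, which is the behaviour one would want from a cue meant to sharpen blurred borders.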

Notes

  1. https://datasets.simula.no/kvasir-seg/.

  2. http://adas.cvc.uab.es/endoscene.

  3. https://polyp.grand-challenge.org/CVCClinicDB/.

  4. https://polyp.grand-challenge.org/EtisLarib/.

  5. http://mv.cvc.uab.es/projects/colon-qa/cvc-colondb.

Acknowledgements

This work was supported in part by the Natural Science Foundation of China under Grants 62106054 and 62366005, and in part by the Science and Technology Project of Guangxi under Grant 2018GXNSFAA281351.

Author information

Contributions

Yilin Qin: Conceptualization, Methodology, Formal analysis, Validation, Software, Writing (original draft preparation). Haiying Xia: Software, Validation, Conceptualization, Resources, Writing (review and editing), Investigation, Supervision, Funding acquisition. Shuxiang Song: Methodology, Writing (review and editing), Supervision.

Corresponding author

Correspondence to Haiying Xia.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

https://github.com/Q-qinyilin/RT-NET.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qin, Y., Xia, H. & Song, S. RT-Net: Region-Enhanced Attention Transformer Network for Polyp Segmentation. Neural Process Lett 55, 11975–11991 (2023). https://doi.org/10.1007/s11063-023-11405-y

  • DOI: https://doi.org/10.1007/s11063-023-11405-y
