Abstract
Colonic polyps are highly correlated with colorectal cancer, and colorectal cancer can be prevented by detecting and removing polyps in the early stages of the disease. However, the detection process relies on the physician’s experience and is prone to missed diagnoses. These drawbacks motivate us to design an algorithm that automatically assists physicians during detection and reduces the rate of missed polyps. Polyp segmentation remains challenging because polyps vary in appearance and have blurred borders with the surrounding mucosa, and existing CNN-based polyp segmentation algorithms have difficulty learning long-range dependencies. We therefore propose a region-enhanced attention transformer network (RT-Net) for polyp segmentation. Unlike existing CNN-based approaches, it employs a pyramid Transformer encoder to improve the learning ability and robustness of the network. In addition, we introduce three modules: the residual multiscale (RMS) module, the region-enhanced attention (REA) module, and the feature aggregation (FA) module. The RMS module learns multiscale information from the encoder features. The REA module uses the prediction map of each decoder layer to guide the network in building target-region and boundary cues, compensating for the local fields of view that are missing in the encoder. The FA module efficiently aggregates the features from the REA module with those from the decoding layer to achieve better segmentation performance. RT-Net is evaluated on five benchmark polyp datasets. Extensive experiments demonstrate that RT-Net exhibits excellent performance compared to other state-of-the-art methods.
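The abstract does not give formulas for the REA module, so the following is only an illustrative sketch of the general idea of letting a coarse prediction map guide region and boundary cues. The function names (`region_boundary_weights`, `enhance`), the boundary weighting `1 - |p - 0.5| / 0.5`, and the additive combination are all assumptions made for illustration; they are not taken from RT-Net.

```python
import math


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def region_boundary_weights(logits):
    """Turn a coarse prediction map (2D list of logits) into two weight maps.

    Assumed weighting (not from the paper):
      region   = p                      (high inside the predicted polyp)
      boundary = 1 - |p - 0.5| / 0.5    (high where the prediction is uncertain)
    """
    region, boundary = [], []
    for row in logits:
        probs = [sigmoid(v) for v in row]
        region.append(probs)
        boundary.append([1.0 - abs(p - 0.5) / 0.5 for p in probs])
    return region, boundary


def enhance(feature, logits):
    """Reweight a single-channel feature map with region and boundary cues.

    The residual-style form f * (1 + region + boundary) is a hypothetical
    choice that keeps the original feature and adds emphasis on top of it.
    """
    region, boundary = region_boundary_weights(logits)
    return [
        [f * (1.0 + r + b) for f, r, b in zip(f_row, r_row, b_row)]
        for f_row, r_row, b_row in zip(feature, region, boundary)
    ]


# One row with a confident background pixel, an uncertain border pixel,
# and a confident polyp pixel.
coarse_logits = [[-10.0, 0.0, 10.0]]
features = [[1.0, 1.0, 1.0]]
enhanced = enhance(features, coarse_logits)
```

In this toy setting the uncertain pixel receives the largest weight, the confident polyp pixel the next largest, and the confident background pixel is left almost unchanged, which is one plausible way to emphasise both target regions and blurred borders.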
Acknowledgements
This work was supported in part by the Natural Science Foundation of China under Grants 62106054 and 62366005, and in part by the Science and Technology Project of Guangxi under Grant 2018GXNSFAA281351.
Author information
Contributions
Yilin Qin: Conceptualization, Methodology, Formal analysis, Validation, Software, Writing (original draft preparation). Haiying Xia: Software, Validation, Conceptualization, Resources, Writing (review and editing), Investigation, Supervision, Funding acquisition. Shuxiang Song: Methodology, Writing (review and editing), Supervision.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Qin, Y., Xia, H. & Song, S. RT-Net: Region-Enhanced Attention Transformer Network for Polyp Segmentation. Neural Process Lett 55, 11975–11991 (2023). https://doi.org/10.1007/s11063-023-11405-y