HT-Net: hierarchical context-attention transformer network for medical ct image segmentation

Ma, Mingjun; Xia, Haiying; Tan, Yumei; Li, Haisheng; Song, Shuxiang

doi:10.1007/s10489-021-03010-0

Published: 15 January 2022

Volume 52, pages 10692–10705, (2022)

Applied Intelligence

Mingjun Ma¹,
Haiying Xia ORCID: orcid.org/0000-0001-8711-1851¹,
Yumei Tan²,
Haisheng Li¹ &
Shuxiang Song¹

Abstract

Convolutional neural networks (CNNs) have been a prevailing technique in the field of medical CT image processing. Although encoder-decoder CNNs exploit locality for efficiency, they cannot adequately model long-range pixel relationships. Recent work has shown that stacking self-attention or transformer layers can effectively learn long-range dependencies, and transformers have been extended to computer vision tasks by splitting images into patches and treating the patches as embeddings. However, transformer-based architectures lack global semantic information interaction and require large-scale datasets for training, which makes them difficult to train effectively with limited data samples. To address these issues, we propose a hierarchical context-attention transformer network (HT-Net), which integrates multi-scale, transformer, and hierarchical context extraction modules in the skip connections. The multi-scale module captures richer CT semantic information, enabling the transformers to better encode feature maps from different CNN stages, tokenized into image patches, as input attention sequences. The hierarchical context attention module complements global information and re-weights the pixels to capture semantic context. Extensive experiments on three datasets demonstrate that the proposed HT-Net outperforms state-of-the-art approaches.
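To make the idea of feeding tokenized feature-map patches to a transformer concrete, the following is a minimal NumPy sketch. It is not the HT-Net implementation, which the abstract does not specify. The function names (`tokenize_feature_map`, `self_attention`), the patch size, the feature-map shape, and the random projection matrices are all assumptions chosen for illustration. It shows a CNN-style feature map being cut into non-overlapping patches, flattened into a token sequence, and passed through a single head of scaled dot-product self-attention.

```python
import numpy as np


def tokenize_feature_map(fmap, patch):
    """Split a (C, H, W) feature map into non-overlapping patch tokens.

    Returns an array of shape (N, C * patch * patch), where
    N = (H // patch) * (W // patch). Hypothetical helper for illustration.
    """
    c, h, w = fmap.shape
    assert h % patch == 0 and w % patch == 0, "H and W must be divisible by patch"
    gh, gw = h // patch, w // patch
    x = fmap.reshape(c, gh, patch, gw, patch)
    x = x.transpose(1, 3, 0, 2, 4)  # (gh, gw, C, patch, patch)
    return x.reshape(gh * gw, c * patch * patch)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


# Assumed toy sizes: 8 channels, 16x16 spatial grid, 4x4 patches.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 16, 16))
tokens = tokenize_feature_map(fmap, patch=4)  # 16 tokens of length 128

d_model = 32  # assumed embedding width
w_q, w_k, w_v = (
    rng.standard_normal((tokens.shape[1], d_model)) * 0.1 for _ in range(3)
)
out, weights = self_attention(tokens, w_q, w_k, w_v)
```

Every token attends to every other token, so each output row can mix information from the whole feature map, which is the long-range dependency that a purely local convolution cannot capture in one layer.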

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61762014, in part by the Science and Technology Project of Guangxi under Grant 2018GXNSFAA281351, and in part by the Innovation Project of Guangxi Graduate Education under Grant YCSW2021096.

Author information

Authors and Affiliations

College of Electronic Engineering, Guangxi Normal University, Guilin, 541004, China
Mingjun Ma, Haiying Xia, Haisheng Li & Shuxiang Song
School of Computer Science and Engineering, Guangxi Normal University, Guilin, 541004, China
Yumei Tan

Corresponding author

Correspondence to Haiying Xia.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ma, M., Xia, H., Tan, Y. et al. HT-Net: hierarchical context-attention transformer network for medical ct image segmentation. Appl Intell 52, 10692–10705 (2022). https://doi.org/10.1007/s10489-021-03010-0

Accepted: 12 November 2021
Published: 15 January 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s10489-021-03010-0
