Convolution-Free Medical Image Segmentation Using Transformers

Karimi, Davood; Vasylechko, Serge Didenko; Gholipour, Ali

doi:10.1007/978-3-030-87193-2_8

Davood Karimi¹⁵,
Serge Didenko Vasylechko¹⁵ &
Ali Gholipour¹⁵

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12901))

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

15k Accesses

Abstract

Like other applications in computer vision, medical image segmentation and his email address have been most successfully addressed using deep learning models that rely on the convolution operation as their main building block. Convolutions enjoy important properties such as sparse interactions, weight sharing, and translation equivariance. These properties give convolutional neural networks (CNNs) a strong and useful inductive bias for vision tasks. However, the convolution operation also has important shortcomings: it performs a fixed operation on every test image regardless of the content and it cannot efficiently model long-range interactions. In this work we show that a network based on self-attention between neighboring patches and without any convolution operations can achieve better results. Given a 3D image block, our network divides it into $n^3$ 3D patches, where $n=3 \text { or } 5$ and computes a 1D embedding for each patch. The network predicts the segmentation map for the center patch of the block based on the self-attention between these patch embeddings. We show that the proposed model can achieve higher segmentation accuracies than a state of the art CNN. For scenarios with very few labeled images, we propose methods for pre-training the network on large corpora of unlabeled images. Our experiments show that with pre-training the advantage of our proposed network over CNNs can be significant when labeled training data is small.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 109.00; Price excludes VAT (USA)

Softcover Book: USD 139.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

TransDeepLab: Convolution-Free Transformer-Based DeepLab v3+ for Medical Image Segmentation

Abstract: 3D Medical Image Segmentation with Transformer-based Scaling of ConvNets

MedNeXt: Transformer-Driven Scaling of ConvNets for Medical Image Segmentation

References

Bai, W., et al.: Semi-supervised learning for network-based cardiac MR image segmentation. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10434, pp. 253–260. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66185-8_29
Chapter Google Scholar
Bai, W., et al.: Recurrent neural networks for aortic image sequence segmentation with sparse annotations. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11073, pp. 586–594. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00937-3_67
Chapter Google Scholar
Bakas, S., et al.: Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the brats challenge. arXiv preprint arXiv:1811.02629 (2018)
Bastiani, M., et al.: Automated processing pipeline for neonatal diffusion MRI in the developing human connectome project. NeuroImage 185, 750–763 (2019)
Article Google Scholar
Bernard, O., et al.: Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: is the problem solved? IEEE Trans. Med. Imaging 37(11), 2514–2525 (2018)
Article Google Scholar
Chen, J., et al.: Transunet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306 (2021)
Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)
Dosovitskiy, A., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
Dou, H., et al.: A deep attentive convolutional neural network for automatic cortical plate segmentation in fetal MRI. arXiv preprint arXiv:2004.12847 (2020)
Gao, Y., Phillips, J.M., Zheng, Y., Min, R., Fletcher, P.T., Gerig, G.: Fully convolutional structured lstm networks for joint 4d medical image segmentation. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 1104–1108 (2018). https://doi.org/10.1109/ISBI.2018.8363764
Gibbs, P., Buckley, D.L., Blackband, S.J., Horsman, A.: Tumour volume determination from MR images by morphological segmentation. Phys. Med. Biol. 41(11), 2437 (1996)
Article Google Scholar
Goodfellow, I., Bengio, Y., Courville, A., Bengio, Y.: Deep Learning, vol. 1. MIT Press, Cambridge (2016)
Google Scholar
Hesamian, M.H., Jia, W., He, X., Kennedy, P.: Deep learning techniques for medical image segmentation: achievements and challenges. J. Digit. Imaging 32(4), 582–596 (2019)
Article Google Scholar
Ho, J., Kalchbrenner, N., Weissenborn, D., Salimans, T.: Axial attention in multidimensional transformers. arXiv preprint arXiv:1912.12180 (2019)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Article Google Scholar
Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., Maier-Hein, K.H.: No new-net. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 234–244. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_21
Chapter Google Scholar
Kamnitsas, K., et al.: Efficient multi-scale 3d CNN with fully connected CRF for accurate brain lesion segmentation. Med. Image Anal. 36, 61–78 (2017)
Article Google Scholar
Khan, S., Naseer, M., Hayat, M., Zamir, S.W., Khan, F.S., Shah, M.: Transformers in vision: A survey. arXiv preprint arXiv:2101.01169 (2021)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR) (2014)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Google Scholar
Le Cun, Y., et al.: Handwritten digit recognition with a back-propagation network. In: Proceedings of the 2nd International Conference on Neural Information Processing Systems, pp. 396–404 (1989)
Google Scholar
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
Article Google Scholar
Milletari, F., Navab, N., Ahmadi, S.A.: V-net: fully convolutional neural networks for volumetric medical image segmentation. In: 3D Vision (3DV), 2016 Fourth International Conference on, pp. 565–571. IEEE (2016)
Google Scholar
Murphy, K.P.: Machine learning: a probabilistic perspective (2012)
Google Scholar
Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381(6583), 607–609 (1996)
Article Google Scholar
Prince, J.L., Pham, D., Tan, Q.: Optimization of MR pulse sequences for bayesian image segmentation. Med. Phys. 22(10), 1651–1656 (1995)
Article Google Scholar
Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
Google Scholar
Taghanaki, S.A., Abhishek, K., Cohen, J.P., Cohen-Adad, J., Hamarneh, G.: Deep semantic segmentation of natural and medical images: a review. Artif. Intell. Rev. 1–42 (2020)
Google Scholar
Thompson, P.M., Toga, A.W.: Detection, visualization and animation of abnormal anatomic structure with a deformable probabilistic brain atlas based on random vector field transformations. Med. Image Anal. 1(4), 271–294 (1997)
Article Google Scholar
Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., Jégou, H.: Training data-efficient image transformers & distillation through attention. arXiv preprint arXiv:2012.12877 (2020)
Vaswani, A., et al.: Attention is all you need. arXiv preprint arXiv:1706.03762 (2017)
Wang, H., Zhu, Y., Green, B., Adam, H., Yuille, A., Chen, L.-C.: Axial-DeepLab: stand-alone axial-attention for panoptic segmentation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12349, pp. 108–126. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58548-8_7
Chapter Google Scholar
Wang, Y., Guo, Q., Zhu, Y.: Medical image segmentation based on deformable models and its applications. In: Deformable Models. Topics in Biomedical Engineering. International Book Series, pp. 209–260. Springer, New York (2007). https://doi.org/10.1007/978-0-387-68343-0_7
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested U-net architecture for medical image segmentation. In: Stoyanov, D., et al. (eds.) DLMIA/ML-CDS -2018. LNCS, vol. 11045, pp. 3–11. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00889-5_1
Chapter Google Scholar

Download references

Acknowledgement

This work was supported in part by the National Institutes of Health (NIH) award numbers R01NS106030, R01EB018988, and R01EB031849; by the Office of the Director of the NIH under number S10OD0250111; and by a Technological Innovations in Neuroscience Award from the McKnight Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the McKnight Foundation.

Author information

Authors and Affiliations

Computational Radiology Laboratory (CRL), Department of Radiology, Boston Children’s Hospital, and Harvard Medical School, Boston, USA
Davood Karimi, Serge Didenko Vasylechko & Ali Gholipour

Authors

Davood Karimi
View author publications
You can also search for this author in PubMed Google Scholar
Serge Didenko Vasylechko
View author publications
You can also search for this author in PubMed Google Scholar
Ali Gholipour
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Davood Karimi .

Editor information

Editors and Affiliations

Erasmus MC - University Medical Center Rotterdam, Rotterdam, The Netherlands
Marleen de Bruijne
University of Basel, Allschwil, Switzerland
Philippe C. Cattin
Inria Nancy Grand Est, Villers-lès-Nancy, France
Stéphane Cotin
ICube, Université de Strasbourg, CNRS, Strasbourg, France
Nicolas Padoy
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Tencent Jarvis Lab, Shenzhen, China
Yefeng Zheng
ICube, Université de Strasbourg, CNRS, Strasbourg, France
Caroline Essert

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Karimi, D., Vasylechko, S.D., Gholipour, A. (2021). Convolution-Free Medical Image Segmentation Using Transformers. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science(), vol 12901. Springer, Cham. https://doi.org/10.1007/978-3-030-87193-2_8

Download citation

DOI: https://doi.org/10.1007/978-3-030-87193-2_8
Published: 21 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87192-5
Online ISBN: 978-3-030-87193-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The Medical Image Computing and Computer Assisted Intervention Society (opens in a new tab)

Convolution-Free Medical Image Segmentation Using Transformers

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

TransDeepLab: Convolution-Free Transformer-Based DeepLab v3+ for Medical Image Segmentation

Abstract: 3D Medical Image Segmentation with Transformer-based Scaling of ConvNets

MedNeXt: Transformer-Driven Scaling of ConvNets for Medical Image Segmentation

References

Acknowledgement

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Societies and partnerships

Subscribe and save

Buy Now

Navigation

Convolution-Free Medical Image Segmentation Using Transformers

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

TransDeepLab: Convolution-Free Transformer-Based DeepLab v3+ for Medical Image Segmentation

Abstract: 3D Medical Image Segmentation with Transformer-based Scaling of ConvNets

MedNeXt: Transformer-Driven Scaling of ConvNets for Medical Image Segmentation

References

Acknowledgement

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Societies and partnerships

Search

Navigation