
TD-Net: unsupervised medical image registration network based on Transformer and CNN


Abstract

Medical image registration is a fundamental task in computer-aided medical diagnosis. Recently, researchers have applied deep learning methods based on convolutional neural networks (CNNs) to registration and have made remarkable progress in medical image registration. Although CNN-based methods provide rich local information for registration, their global modeling ability is weak: they cannot perform long-distance information interaction, which restricts registration performance. The Transformer was originally designed for sequence-to-sequence prediction; thanks to its strong global modeling capability, it now also achieves excellent results on various visual tasks. Compared with CNNs, the Transformer provides rich global information but lacks local information. To address this lack of local information, we propose a U-Net-like hybrid network that combines the Transformer and CNNs to extract both global and local information at each level. Specifically, a CNN is first used to obtain feature maps of the image, and a Transformer encoder is then used to extract global information from them. The Transformer encodings are connected to the upsampling path, where CNNs integrate the local and global information. Finally, the resolution is restored to that of the input image, and the displacement field is obtained after several convolution layers. We evaluate our method on brain MRI scans. Experimental results demonstrate that our method improves accuracy by 1% compared with state-of-the-art approaches.
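The pipeline described in the abstract (CNN feature extraction, a Transformer encoder for global context, CNN upsampling with skip connections, and final convolution layers that output the displacement field) can be illustrated by a minimal sketch. This is not the authors' TD-Net implementation: the module names, channel counts, number of Transformer layers, and normalization choices below are all illustrative assumptions, written in PyTorch.

```python
# A minimal illustrative sketch (PyTorch), not the authors' TD-Net code.
# Channel counts, layer depths, and normalization choices are assumptions.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """3D conv -> instance norm -> LeakyReLU, used for local feature extraction."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            nn.InstanceNorm3d(out_ch),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class HybridRegNet(nn.Module):
    """U-Net-like registration network: a CNN encoder for local features,
    a Transformer encoder over the coarsest feature map for global context,
    and a CNN decoder with skip connections that predicts a displacement field.
    Input volumes must have spatial sizes divisible by 8."""
    def __init__(self, in_ch=2, base=16, num_layers=2, heads=4):
        super().__init__()
        # CNN encoder: each stage halves the spatial resolution.
        self.enc1 = ConvBlock(in_ch, base, stride=2)           # 1/2 resolution
        self.enc2 = ConvBlock(base, base * 2, stride=2)         # 1/4 resolution
        self.enc3 = ConvBlock(base * 2, base * 4, stride=2)     # 1/8 resolution
        # Transformer encoder applied to the flattened 1/8-resolution tokens.
        layer = nn.TransformerEncoderLayer(d_model=base * 4, nhead=heads,
                                           dim_feedforward=base * 8,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=num_layers)
        # CNN decoder that fuses global features with encoder skip connections.
        self.up = nn.Upsample(scale_factor=2, mode='trilinear', align_corners=True)
        self.dec3 = ConvBlock(base * 4 + base * 2, base * 2)
        self.dec2 = ConvBlock(base * 2 + base, base)
        self.dec1 = ConvBlock(base, base)
        # Final convolution outputs a 3-channel (x, y, z) displacement field.
        self.flow = nn.Conv3d(base, 3, kernel_size=3, padding=1)

    def forward(self, moving, fixed):
        x = torch.cat([moving, fixed], dim=1)         # concatenate the image pair
        f1 = self.enc1(x)
        f2 = self.enc2(f1)
        f3 = self.enc3(f2)
        # Flatten the spatial dimensions into a token sequence for self-attention.
        b, c, d, h, w = f3.shape
        tokens = f3.flatten(2).transpose(1, 2)         # (B, D*H*W, C)
        tokens = self.transformer(tokens)
        g3 = tokens.transpose(1, 2).reshape(b, c, d, h, w)
        # Decode: upsample and integrate global context with local skip features.
        y = self.dec3(torch.cat([self.up(g3), f2], dim=1))
        y = self.dec2(torch.cat([self.up(y), f1], dim=1))
        y = self.dec1(self.up(y))                      # back to input resolution
        return self.flow(y)                            # dense displacement field


# Example usage on a pair of single-channel volumes (shapes are illustrative):
# net = HybridRegNet()
# moving = torch.randn(1, 1, 96, 112, 96)
# fixed = torch.randn(1, 1, 96, 112, 96)
# flow = net(moving, fixed)    # flow has shape (1, 3, 96, 112, 96)
```

In the unsupervised setting the abstract describes, the predicted displacement field would then be used to warp the moving image (for example with a spatial-transformer-style resampler), and the network would be trained with an image-similarity loss plus a smoothness penalty on the field.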


Availability of data and material

The data come from a public dataset that can be obtained at https://www.oasis-brains.org/. We preprocessed the data.

Code Availability

Our code is available at https://github.com/SLKaMiHi/TD-Net-unsupervised-medical-image-registration-network-based-on-Transformer-and-CNN


Acknowledgements

This work was supported by the National Natural Science Foundation of China [grant numbers 61772226, 61862056]; the Science and Technology Development Program of Jilin Province [grant number 20210204133YY]; the Natural Science Foundation of Jilin Province [grant number 20200201159JC]; and the Key Laboratory for Symbol Computation and Knowledge Engineering of the National Education Ministry of China, Jilin University.

Author information

Corresponding author

Correspondence to Guixia Liu.

Ethics declarations

Conflict of Interests

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Lei Song, Guixia Liu and Mingrui Ma contributed equally to this work.


About this article


Cite this article

Song, L., Liu, G. & Ma, M. TD-Net: unsupervised medical image registration network based on Transformer and CNN. Appl Intell 52, 18201–18209 (2022). https://doi.org/10.1007/s10489-022-03472-w

