Self Pre-training with Single-Scale Adapter for Left Atrial Segmentation

  • Conference paper
Left Atrial and Scar Quantification and Segmentation (LAScarQS 2022)

Abstract

Accurate Left Atrial (LA) segmentation from Late Gadolinium Enhancement Magnetic Resonance Imaging (LGE MRI) is fundamental to the diagnosis of Atrial Fibrillation (AF). Previous approaches have tended to address this problem by refining the network architecture to leverage spatial priors in medical images. However, such priors are hard to model because of low image quality and the varied shapes of the LA. In this paper, we instead try to learn the priors through generation. The motivation is simple: if a model can generate or recover image content well, it has likely learned the priors well, and with those priors built in it can better segment the LA. Specifically, we investigate the self pre-training paradigm, i.e., models are pre-trained and fine-tuned on the same LGE MRI dataset, based on the Masked Autoencoder (MAE). In the pre-training stage, we use Vision Transformer (ViT) based autoencoders to perform the pretext task of reconstructing the original MRI images from only a subset of their patches; the ViT encoder is thereby encouraged to learn contextual information as priors, since it must aggregate global information to recover the content of the masked patches. In the fine-tuning stage, we further propose a single-scale adapter for the downstream task. The adapter first uses several branches with different numbers of upsampling blocks to remedy the plain, non-hierarchical structure of the ViT, which better adapts the ViT to dense prediction tasks. It then constructs a feature pyramid directly from the single-scale feature map of the ViT, using the multi-scale features from the different branches. Finally, the adapter incorporates a decoder that predicts the segmentation from the feature pyramid. The proposed model (called ViTUNet) outperforms both a baseline trained from scratch and the widely used nnUNet model. The final trained model achieves validation scores of 0.89013, 1.70567, and 17.12375 for the Dice coefficient, average surface distance (ASD), and Hausdorff distance (HD), respectively.
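The masked-reconstruction pretext task described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`patchify`, `random_mask`, `masked_mse`), the 2D list-of-rows image representation, and the 0.75 mask ratio are all assumptions made for illustration, and the abstract does not state the patch size or mask ratio actually used. The sketch splits an image into non-overlapping patches, hides a random subset, and scores a reconstruction only on the hidden patches.

```python
import random


def patchify(image, patch):
    """Split a 2D image (list of rows) into non-overlapping patch x patch tiles.

    Tiles are returned in row-major order (left to right, top to bottom).
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


def random_mask(num_patches, mask_ratio=0.75, seed=0):
    """Randomly choose which patch indices are hidden from the encoder.

    mask_ratio=0.75 is an assumed value, not one taken from the paper.
    Returns (visible_ids, masked_ids), each sorted.
    """
    rng = random.Random(seed)
    ids = list(range(num_patches))
    rng.shuffle(ids)
    num_masked = int(num_patches * mask_ratio)
    return sorted(ids[num_masked:]), sorted(ids[:num_masked])


def masked_mse(pred_tiles, target_tiles, masked_ids):
    """Mean squared pixel error computed over the masked patches only."""
    total, count = 0.0, 0
    for i in masked_ids:
        for pred_row, target_row in zip(pred_tiles[i], target_tiles[i]):
            for p, t in zip(pred_row, target_row):
                total += (p - t) ** 2
                count += 1
    return total / count if count else 0.0
```

In a real pipeline the encoder would see only the tiles indexed by `visible_ids`, a lightweight decoder would predict all tiles, and a loss such as `masked_mse` would compare the predictions against the original tiles at `masked_ids`.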

Notes

  1. https://zmiclab.github.io/projects/lascarqs22/

Author information

Corresponding author

Correspondence to Xiaowei Ding.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tu, C. et al. (2023). Self Pre-training with Single-Scale Adapter for Left Atrial Segmentation. In: Zhuang, X., Li, L., Wang, S., Wu, F. (eds) Left Atrial and Scar Quantification and Segmentation. LAScarQS 2022. Lecture Notes in Computer Science, vol 13586. Springer, Cham. https://doi.org/10.1007/978-3-031-31778-1_3

  • DOI: https://doi.org/10.1007/978-3-031-31778-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31777-4

  • Online ISBN: 978-3-031-31778-1

  • eBook Packages: Computer Science, Computer Science (R0)
