Research article · ICCPR Conference Proceedings · DOI: 10.1145/3581807.3581825

MoAFormer: Aggregating Adjacent Window Features into Local Vision Transformer Using Overlapped Attention Mechanism for Volumetric Medical Segmentation

Published: 22 May 2023

ABSTRACT

Window-based attention alleviates the sharp increase in computation as input image resolution grows and shows excellent performance. However, aggregating global features across different windows remains an open problem. The Swin Transformer builds a hierarchical encoding with a shifted-window mechanism so that information can be exchanged between neighbouring windows. In this work, we investigate the effect of inserting an overlapped attention block (MoA) after the local attention layer and apply it to medical image segmentation tasks. The overlapped attention module uses slightly larger, overlapping patches for the key and value so that information from neighbouring pixels can be propagated, which leads to a significant performance gain. Experimental results on the ACDC and Synapse datasets demonstrate that the proposed method outperforms previous Transformer-based models.
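
For readers unfamiliar with the mechanism, the sketch below shows one way such an overlapped attention layer can be written in PyTorch: queries are taken from non-overlapping windows, while keys and values are unfolded from slightly larger, overlapping windows so that each window can attend to pixels of its neighbours. The class name, window and overlap sizes, and layer layout are illustrative assumptions, not the authors' implementation.

# A minimal, illustrative sketch (PyTorch assumed); names and sizes are not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OverlappedWindowAttention(nn.Module):
    def __init__(self, dim, window=7, overlap=2, heads=4):
        super().__init__()
        assert dim % heads == 0
        self.dim, self.window, self.overlap, self.heads = dim, window, overlap, heads
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (B, C, H, W), with H and W divisible by the window size.
        B, C, H, W = x.shape
        w, o = self.window, self.overlap
        nH, nW = H // w, W // w

        # Queries come from non-overlapping w x w windows.
        q = x.reshape(B, C, nH, w, nW, w).permute(0, 2, 4, 3, 5, 1)
        q = self.q(q.reshape(B * nH * nW, w * w, C))

        # Keys/values come from slightly larger (w + 2*o) windows centred on each
        # query window, so attention can see pixels of the adjacent windows.
        k_size = w + 2 * o
        kv = F.unfold(x, kernel_size=k_size, stride=w, padding=o)      # (B, C*k*k, nH*nW)
        kv = kv.reshape(B, C, k_size * k_size, nH * nW).permute(0, 3, 2, 1)
        k, v = self.kv(kv.reshape(B * nH * nW, k_size * k_size, C)).chunk(2, dim=-1)

        # Standard multi-head scaled dot-product attention inside each window.
        def heads_first(t):
            return t.reshape(t.shape[0], t.shape[1], self.heads, C // self.heads).transpose(1, 2)
        q, k, v = heads_first(q), heads_first(k), heads_first(v)
        attn = (q @ k.transpose(-2, -1)) * (C // self.heads) ** -0.5
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B * nH * nW, w * w, C)
        out = self.proj(out)

        # Fold the windows back into a (B, C, H, W) feature map.
        out = out.reshape(B, nH, nW, w, w, C).permute(0, 5, 1, 3, 2, 4)
        return out.reshape(B, C, H, W)

# Example: a 28 x 28 feature map with 64 channels.
# y = OverlappedWindowAttention(dim=64)(torch.randn(2, 64, 28, 28))  # y: (2, 64, 28, 28)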

Published in

ICCPR '22: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition
November 2022, 683 pages
ISBN: 9781450397056
DOI: 10.1145/3581807
Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
