Research article · DOI: 10.1145/3581807.3581825

MoAFormer: Aggregating Adjacent Window Features into Local Vision Transformer Using Overlapped Attention Mechanism for Volumetric Medical Segmentation

Published: 22 May 2023

Abstract

Window-based attention alleviates the rapid growth in computation as input image resolution increases and shows excellent performance. However, aggregating global features across different windows remains an open problem. The Swin Transformer constructs a hierarchical encoding with a shifted-window mechanism so that information can be exchanged between windows. In this work, we investigate the effect of applying an overlapped attention (MoA) block after the local attention layer and apply it to medical image segmentation tasks. The overlapped attention module uses slightly larger, overlapped patches for the keys and values, enabling information to flow between neighbouring pixels across window borders, which leads to a significant performance gain. Experimental results on the ACDC and Synapse datasets demonstrate that the proposed method outperforms previous Transformer-based models.
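The core idea of overlapped attention can be sketched minimally: queries come from a non-overlapping window, while keys and values are drawn from a slightly larger patch centred on that window, so each window can attend to tokens from its neighbours in a single layer. The sketch below is a single-head NumPy illustration over a 1D token sequence under stated assumptions (the function name, the zero-padding at sequence borders, and the 1D simplification are illustrative, not the paper's actual implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def overlapped_window_attention(x, win=4, overlap=1):
    """Each query window of `win` tokens attends to a key/value patch of
    win + 2*overlap tokens centred on it (zero-padded at the borders),
    letting information flow between neighbouring windows.
    x: (n_tokens, dim) array; n_tokens must be divisible by win."""
    n, d = x.shape
    assert n % win == 0
    # Pad so every window has a full-size overlapped key/value patch.
    pad = np.pad(x, ((overlap, overlap), (0, 0)))
    out = np.empty_like(x)
    for s in range(0, n, win):
        q = x[s:s + win]                        # (win, d) queries
        kv = pad[s:s + win + 2 * overlap]       # (win + 2*overlap, d) keys/values
        attn = softmax(q @ kv.T / np.sqrt(d))   # (win, win + 2*overlap)
        out[s:s + win] = attn @ kv
    return out
```

With `overlap=0` this reduces to plain non-overlapping window attention; with `overlap>0` each window additionally attends to `overlap` tokens on either side, which is the mechanism the abstract describes for transmitting neighbouring-window information.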



Published In

ICCPR '22: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition
November 2022, 683 pages
ISBN: 9781450397056
DOI: 10.1145/3581807

Publisher

Association for Computing Machinery, New York, NY, United States

