
MoAFormer: Aggregating Adjacent Window Features into Local Vision Transformer Using Overlapped Attention Mechanism for Volumetric Medical Segmentation

Authors: Xia Du

ICCPR '22: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition

Pages 121 - 127

https://doi.org/10.1145/3581807.3581825

Published: 22 May 2023

Abstract

Window-based attention alleviates the rapid growth in computation as input image resolution increases, and it performs well in practice. However, aggregating global features across different windows remains an open problem. Swin Transformer addresses this by building a hierarchical encoder with a shifted-window mechanism, which lets information flow between windows. In this work, we investigate the effect of applying an overlapped attention (MoA) block after the local attention layer, and we apply it to medical image segmentation tasks. The overlapped attention module uses slightly larger, overlapping patches for the keys and values so that information from neighbouring pixels can be transmitted, which leads to a significant performance gain. Experimental results on the ACDC and Synapse datasets show that the method outperforms previous Transformer models.
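The idea of drawing keys and values from slightly larger, overlapping patches can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it works on a 2D feature map rather than a volume, uses a single head with identity projections, and restricts each window to its own enlarged patch. The function name `overlapped_window_attention` and the parameters `window` and `overlap` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf receive zero weight.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def overlapped_window_attention(feat, window=4, overlap=1):
    """Illustrative overlapped attention on a float (H, W, C) feature map.

    Queries: tokens of each non-overlapping window x window block.
    Keys/values: tokens of the enlarged (window + 2 * overlap) patch around
    that block, so neighbouring patches share pixels.
    Assumptions: identity Q/K/V projections, one head, zero padding at the
    border with padded positions masked out of the softmax.
    """
    H, W, C = feat.shape
    assert H % window == 0 and W % window == 0
    padded = np.pad(feat, ((overlap, overlap), (overlap, overlap), (0, 0)))
    valid = np.pad(np.ones((H, W), dtype=bool), overlap)  # False on padding
    big = window + 2 * overlap
    out = np.empty_like(feat)
    for i in range(0, H, window):
        for j in range(0, W, window):
            q = feat[i:i + window, j:j + window].reshape(-1, C)
            kv = padded[i:i + big, j:j + big].reshape(-1, C)
            mask = valid[i:i + big, j:j + big].reshape(-1)
            scores = q @ kv.T / np.sqrt(C)
            scores = np.where(mask[None, :], scores, -np.inf)
            attn = softmax(scores, axis=-1)
            out[i:i + window, j:j + window] = (attn @ kv).reshape(window, window, C)
    return out
```

With `overlap=0` the sketch reduces to plain non-overlapping window attention, in which a change in one window cannot affect the output of its neighbour. With `overlap=1`, pixels on the far side of a window boundary enter the keys and values, which is the neighbouring-pixel transmission the abstract describes.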


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
      2. Computer vision tasks
        Biometrics
  2. Computer graphics
    1. Image manipulation
      1. Image processing



Published In


ICCPR '22: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition

November 2022

683 pages

ISBN:9781450397056

DOI:10.1145/3581807

Copyright © 2022 ACM.


Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

ICCPR 2022

ICCPR 2022: 2022 11th International Conference on Computing and Pattern Recognition

November 17 - 19, 2022

Beijing, China
