ABSTRACT
Window-based attention alleviates the sharp growth in computation as input image resolution increases, and it performs well in practice. However, aggregating global features across different windows remains an open problem. Swin Transformer addresses this by building a hierarchical encoding with a shifted-window mechanism, so that information is exchanged between windows. In this work, we investigate the effect of applying an overlapped attention block (MoA) after the local attention layer and apply it to medical image segmentation tasks. The overlapped attention module uses slightly larger, overlapping patches for the keys and values so that information can pass between neighbouring pixels, which leads to a significant performance gain. Experimental results on the ACDC and Synapse datasets show that the method outperforms previous Transformer models.
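The overlap idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a 1-D token sequence instead of a 3-D volume, omits the learned query/key/value projections and multiple heads, and uses hypothetical names (`overlapped_window_attention`, `window`, `overlap`). It only shows how drawing keys and values from a slightly larger window lets a query see tokens that belong to the neighbouring window.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def overlapped_window_attention(tokens, window=4, overlap=1):
    """Toy 1-D overlapped window attention (illustrative assumption only).

    tokens:  list of feature vectors (lists of floats)
    window:  size of each non-overlapping query window
    overlap: extra tokens on each side that keys/values may cover
    """
    n = len(tokens)
    if n % window != 0:
        raise ValueError("sequence length must be divisible by window")
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for start in range(0, n, window):
        # Queries come from the plain, non-overlapping window.
        queries = tokens[start:start + window]
        # Keys and values come from a slightly larger window that
        # reaches into the neighbouring windows (clipped at the borders).
        lo = max(0, start - overlap)
        hi = min(n, start + window + overlap)
        context = tokens[lo:hi]
        for q in queries:
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
            weights = softmax(scores)
            output.append([
                sum(w * v[d] for w, v in zip(weights, context))
                for d in range(dim)
            ])
    return output


if __name__ == "__main__":
    seq = [[float(i), 1.0] for i in range(8)]
    result = overlapped_window_attention(seq, window=4, overlap=1)
    print(len(result), "output tokens of dimension", len(result[0]))
```

With `overlap=0` this reduces to ordinary window attention, in which a token never sees anything outside its own window. With `overlap=1`, the last token of one window also attends to the first token of the next.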
Index Terms
- MoAFormer: Aggregating Adjacent Window Features into Local Vision Transformer Using Overlapped Attention Mechanism for Volumetric Medical Segmentation