Combining frequency transformer and CNNs for medical image segmentation

Labbihi, Ismayl; El Meslouhi, Othmane; Benaddy, Mohamed; Kardouchi, Mustapha; Akhloufi, Moulay

doi:10.1007/s11042-023-16279-9

Published: 31 July 2023

Multimedia Tools and Applications, volume 83, pages 21197–21212 (2024)

Ismayl Labbihi¹,
Othmane El Meslouhi²,
Mohamed Benaddy¹,
Mustapha Kardouchi³ &
Moulay Akhloufi³

Abstract

Image segmentation is one of the most challenging tasks in digital image processing. It has many medical applications, such as tumor segmentation, organ segmentation, and abnormality segmentation. Recent techniques that combine convolution-based models with transformers have been proposed for automatic medical image segmentation. These techniques achieve good results but require considerable time and computing resources. In this paper, we propose a new model for medical image segmentation that combines CNNs and frequency transformers in parallel to minimize the number of parameters and reduce computation time. The model is composed of two main branches and learns the global-local feature interactions present in a medical image. The first branch, based on a Frequency Transformer (FT), employs the Fourier transform instead of multi-head attention to capture global dependencies, while the second, a shallow convolutional neural network (CNN), extracts rich local information. With a small number of parameters, the proposed model was tested on several public medical image databases and achieves state-of-the-art results on lesion and tumor segmentation tasks.
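
The two-branch idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the global branch assumes FNet-style mixing (a 2D discrete Fourier transform over the token and channel axes, keeping the real part), the local branch is a single sliding-window filter standing in for a shallow CNN, and fusion by channel concatenation is a hypothetical choice. The names `fourier_mix`, `local_filter`, and `two_branch_block`, the 3-tap kernel, and the tensor sizes are all illustrative assumptions.

```python
import numpy as np


def fourier_mix(tokens: np.ndarray) -> np.ndarray:
    """Global branch stand-in: 2D DFT over (token, channel) axes, real part kept.

    Assumption: FNet-style mixing; the paper's exact FT block may differ.
    """
    return np.fft.fft2(tokens, axes=(-2, -1)).real


def local_filter(tokens: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Local branch stand-in: one sliding-window filter along the token axis.

    Zero padding keeps the output length equal to the input length
    (odd kernel length assumed). Each channel is filtered independently.
    """
    pad = len(kernel) // 2
    padded = np.pad(tokens, ((pad, pad), (0, 0)))
    rows = [kernel @ padded[i:i + len(kernel)] for i in range(tokens.shape[0])]
    return np.stack(rows)


def two_branch_block(tokens: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Run both branches on the same input and concatenate along channels.

    Concatenation is a hypothetical fusion choice, not taken from the paper.
    """
    return np.concatenate(
        [fourier_mix(tokens), local_filter(tokens, kernel)], axis=-1
    )


rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))       # 8 tokens (e.g. image patches), 4 channels
kernel = np.array([0.25, 0.5, 0.25])   # hypothetical 3-tap local filter
features = two_branch_block(tokens, kernel)
print(features.shape)
```

In this sketch, changing a single input token alters every output of `fourier_mix`, whereas `local_filter` only changes at positions within the kernel's reach. That is the global versus local distinction the abstract draws, and the Fourier branch achieves it without any learned attention weights.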

Data Availability

All datasets are available at their respective links.

Author information

Authors and Affiliations

LabSI Laboratory, Faculty of Sciences, Ibn Zohr University, 80000, Agadir, Morocco
Ismayl Labbihi & Mohamed Benaddy
SARS Group, National School of Applied Sciences, Cadi Ayyad University, 46000, Safi, Morocco
Othmane El Meslouhi
PRIME Laboratory, Department of Computer Sciences, Université de Moncton, 18 Antonine-Maillet Ave, E1A 3E9, Moncton, NB, Canada
Mustapha Kardouchi & Moulay Akhloufi

Contributions

Ismayl Labbihi: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Resources; Software; Validation; Visualization; Writing - original draft. Othmane El Meslouhi: Conceptualization; Formal analysis; Investigation; Methodology; Resources; Validation; Supervision; Writing - review and editing. Mohamed Benaddy: Conceptualization; Formal analysis; Investigation; Methodology; Validation; Supervision; Writing - review and editing. Mustapha Kardouchi: Project administration; Supervision; Validation; Visualization; Writing - review and editing; Resources; Data curation. Moulay Akhloufi: Project administration; Supervision; Validation; Visualization; Resources; Data curation.

Corresponding author

Correspondence to Ismayl Labbihi.

Ethics declarations

Conflicts of interest

The responsibility for the content of this article rests with its authors. The authors also report that there is no conflict of interest among them.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Othmane El Meslouhi, Mohamed Benaddy, Mustapha Kardouchi and Moulay Akhloufi contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Labbihi, I., El Meslouhi, O., Benaddy, M. et al. Combining frequency transformer and CNNs for medical image segmentation. Multimed Tools Appl 83, 21197–21212 (2024). https://doi.org/10.1007/s11042-023-16279-9

Received: 17 August 2022
Revised: 01 July 2023
Accepted: 04 July 2023
Published: 31 July 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11042-023-16279-9
