Fusing Convolution and Self-Attention Parallel in Frequency Domain for Image Deblurring

Huang, Xuandong; He, JingSong

doi:10.1007/s11063-023-11228-x

Published: 22 March 2023

Neural Processing Letters, Volume 55, pages 9811–9829 (2023)

Abstract

Deblurring is a challenging problem in image restoration, and effective deblurring depends on both the local details and the global information of an image. This paper therefore proposes a deblurring model that integrates convolution and Transformer in parallel. Unlike existing methods that use a single operator or a serial combination, the model's convolution operation and self-attention mechanism are learned in parallel and extract features separately; the features are then fused in the frequency domain. Convolution is well suited to extracting local information, while self-attention focuses more on global information, so the parallel structure lets the model capture both simultaneously. The frequency-domain fusion module is derived from an analysis of the mathematical model of image blur, and the results indicate that the proposed model is reasonable. Experiments on multiple deblurring datasets verify that the proposed parallel Convolution-Transformer structure and the frequency-domain fusion method are effective.
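The parallel-then-fuse idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' architecture: the abstract does not give the fusion rule, so everything here is an assumption made for illustration. The local branch is a fixed 3x3 mean filter standing in for a learned convolution, the global branch is single-head self-attention with identity projections, and the hypothetical `fuse_in_frequency` function simply takes low frequencies from the global branch and high frequencies from the local branch, with an arbitrary `cutoff`.

```python
import numpy as np


def local_branch(x):
    """3x3 mean filter with circular padding (stand-in for a learned convolution)."""
    out = np.zeros_like(x)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += np.roll(x, shift=(dy, dx), axis=(0, 1))
    return out / 9.0


def global_branch(x):
    """Single-head self-attention over all pixels.

    Assumption: Q, K and V are the raw pixel values (identity projections);
    a trained model would learn these projections.
    """
    h, w = x.shape
    tokens = x.reshape(h * w, 1)                  # one scalar feature per pixel
    scores = tokens @ tokens.T                    # (h*w, h*w) pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)   # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ tokens).reshape(h, w)


def fuse_in_frequency(local_feat, global_feat, cutoff=0.25):
    """Hypothetical fusion rule: low frequencies from the global branch,
    high frequencies from the local branch, combined in the 2-D FFT domain."""
    h, w = local_feat.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    low = (np.sqrt(fy ** 2 + fx ** 2) <= cutoff).astype(float)
    spectrum = low * np.fft.fft2(global_feat) + (1.0 - low) * np.fft.fft2(local_feat)
    return np.fft.ifft2(spectrum).real


# The two branches see the same input and run independently (in parallel),
# and only meet in the fusion step.
rng = np.random.default_rng(0)
x = rng.random((16, 16))
local_feat = local_branch(x)
global_feat = global_branch(x)
fused = fuse_in_frequency(local_feat, global_feat)
```

A per-frequency mask is used because a plain weighted sum of the two spectra would be equivalent to adding the features in the spatial domain; the paper's actual module may combine the branches quite differently.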

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

School of Microelectronics, University of Science and Technology of China, HuangShan Road, Hefei, 230027, Anhui, China
Xuandong Huang & JingSong He

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by XH. The first draft of the manuscript was written by XH, and JH commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to JingSong He.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, X., He, J. Fusing Convolution and Self-Attention Parallel in Frequency Domain for Image Deblurring. Neural Process Lett 55, 9811–9829 (2023). https://doi.org/10.1007/s11063-023-11228-x

Accepted: 03 March 2023
Published: 22 March 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s11063-023-11228-x
