Multi-level wavelet network based on CNN-Transformer hybrid attention for single image deraining

Liu, Bin; Fang, Siyan

doi:10.1007/s00521-023-08899-x

Original Article
Published: 09 August 2023

Volume 35, pages 22387–22404, (2023)

Neural Computing and Applications

Abstract

Removing rain streaks from rainy images can improve the accuracy of computer vision applications such as object detection. To exploit the frequency-domain analysis properties of wavelets and to combine the strengths of Convolutional Neural Networks (CNNs) and Transformers, a Multi-level Wavelet Network Based on CNN-Transformer Hybrid Attention (MWN-CTHA) for single image deraining is proposed. MWN-CTHA obtains multi-scale low-frequency and high-frequency images through a multi-level non-separable lifting wavelet transform and uses a CNN-Transformer Hybrid Attention Block (CTHAB) to learn global structure from the low-frequency images and detail information from the high-frequency images. CTHAB consists of a CA-SA Layer (CSL) and a Detail-enhanced Attention Feed-forward Layer (DAFL). CSL uses the non-local modeling ability of self-attention to capture long-range rain streaks and uses convolutional attention to strengthen the search for local rain streaks; the convolution also helps self-attention achieve a better feature representation. DAFL uses a depth-wise convolutional layer to supplement detailed features and filters the information of the feed-forward layer through Dual-branch Attention. Experimental results on four synthetic datasets show that the proposed method achieves higher PSNR and SSIM than the state-of-the-art method DANet, with improvements of 1.07 dB and 0.0098, respectively. The code is available at https://github.com/fashyon/MWN-CTHA.
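
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of how PSNR is commonly computed and of what a 1.07 dB gain implies for mean squared error. It is not taken from the authors' repository; the function names, the 8-bit peak value of 255, and the use of plain grayscale lists are assumptions made for illustration, and the preview does not state which color channel or protocol the paper uses.

```python
import math


def psnr(reference, restored, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows).

    Assumption: pixel values share the same range, with `peak` as the maximum.
    """
    squared_error = 0.0
    count = 0
    for ref_row, out_row in zip(reference, restored):
        for ref_px, out_px in zip(ref_row, out_row):
            squared_error += (ref_px - out_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by `gain_db` dB."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, a PSNR gain of 1.07 dB corresponds to `mse_ratio_for_gain(1.07)`, roughly 0.78, that is, about 22% lower mean squared error at the same peak value.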

Data availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

References

Chen Y L, Hsu C T. A generalized low-rank appearance model for spatio-temporally correlated rain streaks [C]. Proceedings of the IEEE international conference on computer vision. 2013: 1968–1975.
Luo Y, Xu Y, Ji H. Removing rain from a single image via discriminative sparse coding [C]. Proceedings of the IEEE international conference on computer vision. 2015: 3397–3405.
Kang LW, Lin CW, Fu YH (2011) Automatic single-image-based rain streaks removal via image decomposition [J]. IEEE Trans Image Process 21(4):1742–1755
Li X, Wu J, Lin Z, et al. Recurrent squeeze-and-excitation context aggregation net for single image deraining [C]. Proceedings of the European conference on computer vision (ECCV). 2018: 254–269
Ren D, Zuo W, Hu Q, et al. Progressive image deraining networks: A better and simpler baseline [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3937–3946
Wang T, Yang X, Xu K, et al. Spatial attentive single-image deraining with a high quality real rain dataset [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 12270–12279
Ren D, Shang W, Zhu P et al (2020) Single image deraining using bilateral recurrent network [J]. IEEE Trans Image Process 29:6852–6863
Wang C, Xing X, Wu Y, et al. Dcsfn: Deep cross-scale fusion network for single image rain removal [C]. Proceedings of the 28th ACM international conference on multimedia. 2020: 1643–1651
Wang H, Xie Q, Zhao Q, et al. A model-driven deep neural network for single image rain removal [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 3103–3112
Guo Q, Sun J, Juefei-Xu F, Ma L, Xie X, Feng W, Liu Y, Zhao J (2021) EfficientDeRain: learning pixel-wise dilation filtering for high-efficiency single-image deraining [C]. Proc AAAI Conf Artif Intell 35(2):1487–1495
Yi Q, Li J, Dai Q, et al. Structure-Preserving Deraining with Residue Channel Prior Guidance [C]. Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 4238–4247.
Jiang K, Wang Z, Yi P et al. (2021) Rain-free and residue hand-in-hand: A progressive coupled network for real-time image deraining[J]. IEEE Trans Image Process 30:7404–7418
Cui X, Wang C, Ren D, et al. (2022) Semi-supervised image deraining using knowledge distillation[J]. IEEE Transactions on Circuits and Systems for Video Technology 32(12):8327–8341
Jiang K, Wang Z, Chen C, et al. (2022) Magic ELF: Image deraining meets association learning and transformer[J]. arXiv preprint arXiv:2207.10455
Jiang K, Wang Z, Wang Z, et al. (2022) Danet: Image deraining via dynamic association learning [C]//Proc 31st Int Joint Conf Artif Intell
Liu P, Zhang H, Zhang K, et al. Multi-level wavelet-CNN for image restoration [C]//Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2018: 773–782
Park Y, Jeon M, Lee J et al (2022) MCW-Net: single image deraining with multi-level connections and wide regional non-local blocks [J]. Signal Process 105:116701
Cooley JW, Lewis PAW, Welch PD (1969) The fast Fourier transform and its applications [J]. IEEE Trans Educ 12(1):27–34
Deng G, Cahill L W. An adaptive Gaussian filter for noise reduction and edge detection [C]//1993 IEEE conference record nuclear science symposium and medical imaging conference. IEEE, 1993: 1615–1619
Park N, Kim S (2022) How do vision transformers work? [J]. arXiv preprint arXiv:2202.06709
Zhang K, Li Y, Liang J, et al. (2022) Practical blind denoising via swin-conv-unet and data synthesis [J]. arXiv preprint arXiv:2203.13278
Chen L, Chu X, Zhang X, et al. (2022) Simple baselines for image restoration [J]. arXiv preprint arXiv:2204.04676
Ren S, Zhou D, He S, et al. Shunted Self-Attention via Multi-Scale Token Aggregation [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 10853–10862
Huang T, Huang L, You S, et al. (2022) LightViT: Towards Light-Weight Convolution-Free Vision Transformers [J]. arXiv preprint arXiv:2207.05557
Fu X, Huang J, Ding X et al (2017) Clearing the skies: a deep network architecture for single-image rain removal [J]. IEEE Trans Image Process 26(6):2944–2956
Zhang H, Patel V M. Density-aware single image de-raining using a multi-stream dense network [C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 695–704
Yang W, Liu J, Yang S et al (2019) Scale-free single image deraining via visibility-enhanced recurrent wavelet learning [J]. IEEE Trans Image Process 28(6):2948–2961
Zhao J, Xie J, Xiong R, et al. Pyramid Convolutional Network for Single Image Deraining [C]//CVPR Workshops. 2019: 9–16.
Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need [J]. Advances in neural information processing systems. 30
Dosovitskiy A, Beyer L, Kolesnikov A, et al. (2020) An image is worth 16x16 words: Transformers for image recognition at scale [J]. arXiv preprint arXiv:2010.11929
Liu Z, Lin Y, Cao Y, et al. Swin transformer: Hierarchical vision transformer using shifted windows [C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 10012–10022.
Liu Z, Hu H, Lin Y, et al. Swin transformer v2: Scaling up capacity and resolution [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 12009–12019.
Yang F, Yang H, Fu J, et al. Learning texture transformer network for image super-resolution [C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 5791–5800.
Chen H, Wang Y, Guo T, et al. Pre-trained image processing transformer [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12299–12310.
Liang J, Cao J, Sun G, et al. Swinir: Image restoration using swin transformer [C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 1833–1844.
Liu B, Liu W (2018) The lifting factorization of 2D 4-channel nonseparable wavelet transforms [J]. Inf Sci 456:113–130
Wang C, Xu H, Zhang X, et al. Convolutional Embedding Makes Hierarchical Vision Transformer Stronger[J]. arXiv preprint arXiv:2207.13317, 2022.
Liu B, Peng JX (2009) Fusion method of multi-spectral image and panchromatic image based on four channels non-separable additive wavelets. Chin J Computers 32(2):350–356 (In Chinese)
Wang Z, Bovik AC, Sheikh HR et al (2004) Image quality assessment: from error visibility to structural similarity [J]. IEEE Trans Image Process 13(4):600–612
Yang W, Tan R T, Feng J, et al. Deep joint rain detection and removal from a single image [C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1357–1366.
Li Y, Tan R T, Guo X, et al. Rain streak removal using layer priors [C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 2736–2744.
Zhang H, Sindagi V, Patel VM (2019) Image de-raining using a conditional generative adversarial network[J]. IEEE Trans Circuits Syst Video Technol 30(11):3943–3956
Kingma D P, Ba J. (2014) Adam: A method for stochastic optimization [J]. arXiv preprint arXiv:1412.6980
Li Y, Monno Y, Okutomi M. Single Image Deraining Network with Rain Embedding Consistency and Layered LSTM [C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2022: 4060–4069
Jiang K, Wang Z, Yi P, et al. Multi-scale progressive fusion network for single image deraining [C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 8346–8355
Ge Z, Liu S, Wang F, et al. (2021) Yolox: Exceeding yolo series in 2021 [J]. arXiv preprint arXiv:2107.08430

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61471160).

Author information

Authors and Affiliations

School of Computer and Information Engineering, Hubei University, Wuhan, 430062, China
Bin Liu & Siyan Fang

Corresponding author

Correspondence to Siyan Fang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

The time-domain form of the constructed two-dimensional four-channel non-separable lifting wavelet filter bank:

$$
\left\{
\begin{aligned}
\mathbf{H}_{0} &= \begin{bmatrix}
0.2980 & 0.1334 & -0.0552 & 0.1235 \\
-0.0084 & 0.0037 & 0.0015 & 0.0035 \\
0.0035 & 0.0015 & 0.0037 & -0.0084 \\
0.1235 & -0.0552 & 0.1334 & 0.2980
\end{bmatrix} \\
\mathbf{H}_{1} &= \begin{bmatrix}
-0.1235 & -0.0552 & -0.1334 & 0.2980 \\
0.0035 & -0.0015 & 0.0037 & 0.0084 \\
0.0084 & 0.0037 & -0.0015 & 0.0035 \\
0.2980 & -0.1334 & -0.0552 & -0.1235
\end{bmatrix} \\
\mathbf{H}_{2} &= \begin{bmatrix}
-0.1065 & -0.0477 & -0.1362 & 0.3045 \\
0.0030 & -0.0013 & 0.0038 & 0.0085 \\
-0.0085 & -0.0038 & 0.0013 & -0.0030 \\
-0.3045 & 0.1362 & 0.0477 & 0.1065
\end{bmatrix} \\
\mathbf{H}_{3} &= \begin{bmatrix}
0.3045 & 0.1362 & -0.0477 & 0.1065 \\
-0.0085 & 0.0038 & 0.0013 & 0.0030 \\
-0.0030 & -0.0013 & -0.0038 & 0.0085 \\
-0.1065 & 0.0477 & -0.1362 & -0.3045
\end{bmatrix}
\end{aligned}
\right.
$$
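
As a rough illustration of how a four-channel filter bank of this kind can split an image into four half-resolution subbands, the sketch below filters an image with each 4×4 kernel and keeps every second sample in both directions. The function name `analyze`, the use of correlation rather than flipped convolution, the periodic boundary handling, and the sampling phase are all assumptions made for the example and may differ from the authors' implementation. One property can be checked directly from the listed coefficients: those of H0 sum to about 1 and those of H1, H2 and H3 sum to about 0, which is consistent with one low-pass and three high-pass channels.

```python
H0 = [[0.2980, 0.1334, -0.0552, 0.1235],
      [-0.0084, 0.0037, 0.0015, 0.0035],
      [0.0035, 0.0015, 0.0037, -0.0084],
      [0.1235, -0.0552, 0.1334, 0.2980]]
H1 = [[-0.1235, -0.0552, -0.1334, 0.2980],
      [0.0035, -0.0015, 0.0037, 0.0084],
      [0.0084, 0.0037, -0.0015, 0.0035],
      [0.2980, -0.1334, -0.0552, -0.1235]]
H2 = [[-0.1065, -0.0477, -0.1362, 0.3045],
      [0.0030, -0.0013, 0.0038, 0.0085],
      [-0.0085, -0.0038, 0.0013, -0.0030],
      [-0.3045, 0.1362, 0.0477, 0.1065]]
H3 = [[0.3045, 0.1362, -0.0477, 0.1065],
      [-0.0085, 0.0038, 0.0013, 0.0030],
      [-0.0030, -0.0013, -0.0038, 0.0085],
      [-0.1065, 0.0477, -0.1362, -0.3045]]
FILTERS = [H0, H1, H2, H3]


def analyze(image, filters=FILTERS):
    """One decomposition level: correlate with each 4x4 kernel at stride 2.

    `image` is a list of rows with even height and width. Boundaries wrap
    around (periodic extension), which is an assumption of this sketch.
    Returns four subbands, each half the input size in both directions.
    """
    rows, cols = len(image), len(image[0])
    subbands = []
    for kernel in filters:
        band = []
        for r in range(0, rows, 2):
            out_row = []
            for c in range(0, cols, 2):
                acc = 0.0
                for i in range(4):
                    for j in range(4):
                        acc += kernel[i][j] * image[(r + i) % rows][(c + j) % cols]
                out_row.append(acc)
            band.append(out_row)
        subbands.append(band)
    return subbands
```

Applying `analyze` again to the first (low-frequency) subband would give the next, coarser level, which is one way to read the "multi-level" decomposition described in the abstract.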

Appendix 2

The predict and update operators of the filter bank:

$$
\left\{
\begin{aligned}
\mathbf{Predict}_{1} &= \begin{bmatrix}
1 & 0 & 0 & 0 \\
-0.0281 & 1 & 0 & 0 \\
0.4475 & 0 & 1 & 0 \\
0.0126 & -0.4475 & 0.0281 & 1
\end{bmatrix} \\
\mathbf{Predict}_{2} &= \begin{bmatrix}
1 & 0 & 0 & 0 \\
0.4142 & 1 & 0 & 0 \\
0.4142 & 1 & 1 & 0 \\
1 & 0 & 0.3499 & 1
\end{bmatrix} \\
\mathbf{Update}_{1} &= \begin{bmatrix}
0.9124 & 0.0256 & -0.4083 & 0.0115 \\
0 & 0.9131 & 0 & 0.4086 \\
0 & 0 & 1.0951 & -0.0307 \\
0 & 0 & 0 & 1.0960
\end{bmatrix} \\
\mathbf{Update}_{2} &= \begin{bmatrix}
1.3066 & -0.5412 & -0.4671 & 1.3349 \\
0 & 1.5307 & -1.1414 & -1.0200 \\
0 & 0 & 2.6697 & 0.9342 \\
0 & 0 & 0 & -2.9966
\end{bmatrix}
\end{aligned}
\right.
$$
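
The predict operators listed here are lower triangular with a unit diagonal, and the update operators are upper triangular with non-zero diagonal entries, so each step can be undone exactly by substitution. The sketch below illustrates that invertibility on a single 4-vector, for example the four samples of one 2×2 pixel block. The order Predict1, Update1, Predict2, Update2, the omission of any delay or shift steps between them, and all function names are assumptions made for illustration; the preview does not specify how the paper chains these operators.

```python
PREDICT_1 = [[1, 0, 0, 0],
             [-0.0281, 1, 0, 0],
             [0.4475, 0, 1, 0],
             [0.0126, -0.4475, 0.0281, 1]]
PREDICT_2 = [[1, 0, 0, 0],
             [0.4142, 1, 0, 0],
             [0.4142, 1, 1, 0],
             [1, 0, 0.3499, 1]]
UPDATE_1 = [[0.9124, 0.0256, -0.4083, 0.0115],
            [0, 0.9131, 0, 0.4086],
            [0, 0, 1.0951, -0.0307],
            [0, 0, 0, 1.0960]]
UPDATE_2 = [[1.3066, -0.5412, -0.4671, 1.3349],
            [0, 1.5307, -1.1414, -1.0200],
            [0, 0, 2.6697, 0.9342],
            [0, 0, 0, -2.9966]]


def apply(matrix, vec):
    """Matrix-vector product for small dense matrices."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def solve_lower(matrix, rhs):
    """Forward substitution: solve matrix @ x = rhs for lower-triangular matrix."""
    x = []
    for i, row in enumerate(matrix):
        acc = rhs[i] - sum(row[j] * x[j] for j in range(i))
        x.append(acc / row[i])
    return x


def solve_upper(matrix, rhs):
    """Back substitution: solve matrix @ x = rhs for upper-triangular matrix."""
    n = len(rhs)
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = rhs[i] - sum(matrix[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / matrix[i][i]
    return x


# Assumed step order; each entry pairs an operator with the solver that undoes it.
STEPS = [(PREDICT_1, solve_lower), (UPDATE_1, solve_upper),
         (PREDICT_2, solve_lower), (UPDATE_2, solve_upper)]


def forward(samples):
    """Apply the lifting steps in the assumed order."""
    for matrix, _ in STEPS:
        samples = apply(matrix, samples)
    return samples


def inverse(coeffs):
    """Undo the lifting steps in reverse order by triangular substitution."""
    for matrix, solver in reversed(STEPS):
        coeffs = solver(matrix, coeffs)
    return coeffs
```

For any input such as `[12.0, 7.0, 3.0, 9.0]`, `inverse(forward(x))` returns the input up to floating-point rounding, which is the perfect-reconstruction behavior that makes lifting structures attractive.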

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, B., Fang, S. Multi-level wavelet network based on CNN-Transformer hybrid attention for single image deraining. Neural Comput & Applic 35, 22387–22404 (2023). https://doi.org/10.1007/s00521-023-08899-x

Received: 21 October 2022
Accepted: 14 July 2023
Published: 09 August 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s00521-023-08899-x
