Multi-level wavelet network based on CNN-Transformer hybrid attention for single image deraining

  • Original Article
  • Neural Computing and Applications

Abstract

Removing rain streaks from rainy images can improve the accuracy of computer vision applications such as object detection. To make full use of the frequency-domain analysis characteristics of wavelets and to combine the advantages of the Convolutional Neural Network (CNN) and the Transformer, a Multi-level Wavelet Network Based on CNN-Transformer Hybrid Attention (MWN-CTHA) is proposed for single image deraining. MWN-CTHA obtains multi-scale low-frequency and high-frequency images through a multi-level non-separable lifting wavelet transform and uses the CNN-Transformer Hybrid Attention Block (CTHAB) to learn global structure from the low-frequency images and detail information from the high-frequency images. CTHAB consists of a CA-SA Layer (CSL) and a Detail-enhanced Attention Feed-forward Layer (DAFL). CSL uses the non-local modeling ability of self-attention to capture long-range rain streaks and uses convolutional attention to strengthen the search for local rain streaks; the convolution also helps self-attention achieve a better feature representation. DAFL uses a depth-wise convolutional layer to supplement detailed features and filters the information in the feed-forward layer through dual-branch attention. Experimental results on four synthetic datasets show that the proposed method achieves higher PSNR and SSIM than the state-of-the-art method DANet, with improvements of 1.07 dB and 0.0098, respectively. The code is available at https://github.com/fashyon/MWN-CTHA.
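
For readers less familiar with the PSNR figure quoted in the abstract, the following is a minimal sketch of the standard definition, PSNR = 10·log10(MAX² / MSE). The function name `psnr` and the assumption that pixel values lie in [0, 1] are illustrative choices and are not taken from the authors' code; deraining papers often evaluate on the luminance channel only, which this sketch does not attempt.

```python
import numpy as np

def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")  # identical images have unbounded PSNR
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy check: a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01.
clean = np.zeros((4, 4))
derained = np.full((4, 4), 0.1)
print(f"{psnr(clean, derained):.2f} dB")
```

Because the scale is logarithmic, a gain of 1.07 dB on a single image corresponds to an MSE ratio of about 10^(-0.107) ≈ 0.78. Dataset-level averages do not translate this directly.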

Data availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

References

  1. Chen Y L, Hsu C T. A generalized low-rank appearance model for spatio-temporally correlated rain streaks [C]. Proceedings of the IEEE international conference on computer vision. 2013: 1968–1975.

  2. Luo Y, Xu Y, Ji H. Removing rain from a single image via discriminative sparse coding [C]. Proceedings of the IEEE international conference on computer vision. 2015: 3397–3405.

  3. Kang LW, Lin CW, Fu YH (2011) Automatic single-image-based rain streaks removal via image decomposition [J]. IEEE Trans Image Process 21(4):1742–1755

  4. Li X, Wu J, Lin Z, et al. Recurrent squeeze-and-excitation context aggregation net for single image deraining [C]. Proceedings of the European conference on computer vision (ECCV). 2018: 254–269

  5. Ren D, Zuo W, Hu Q, et al. Progressive image deraining networks: A better and simpler baseline [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3937–3946

  6. Wang T, Yang X, Xu K, et al. Spatial attentive single-image deraining with a high quality real rain dataset [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 12270–12279

  7. Ren D, Shang W, Zhu P et al (2020) Single image deraining using bilateral recurrent network [J]. IEEE Trans Image Process 29:6852–6863

  8. Wang C, Xing X, Wu Y, et al. Dcsfn: Deep cross-scale fusion network for single image rain removal [C]. Proceedings of the 28th ACM international conference on multimedia. 2020: 1643–1651

  9. Wang H, Xie Q, Zhao Q, et al. A model-driven deep neural network for single image rain removal [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 3103–3112

  10. Guo Q, Sun J, Juefei-Xu F, Ma L, Xie X, Feng W, Liu Y, Zhao J (2021) EfficientDeRain: learning pixel-wise dilation filtering for high-efficiency single-image deraining [C]. Proc AAAI Conf Artif Intell 35(2):1487–1495

  11. Yi Q, Li J, Dai Q, et al. Structure-Preserving Deraining with Residue Channel Prior Guidance [C]. Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 4238–4247.

  12. Jiang K, Wang Z, Yi P et al. (2021) Rain-free and residue hand-in-hand: A progressive coupled network for real-time image deraining [J]. IEEE Trans Image Process 30:7404–7418

  13. Cui X, Wang C, Ren D, et al. (2022) Semi-supervised image deraining using knowledge distillation [J]. IEEE Transactions on Circuits and Systems for Video Technology 32(12):8327–8341

  14. Jiang K, Wang Z, Chen C, et al. (2022) Magic ELF: Image deraining meets association learning and transformer [J]. arXiv preprint arXiv:2207.10455

  15. Jiang K, Wang Z, Wang Z, et al. (2022) Danet: Image deraining via dynamic association learning [C]//Proc 31st Int Joint Conf Artif Intell

  16. Liu P, Zhang H, Zhang K, et al. Multi-level wavelet-CNN for image restoration [C]//Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2018: 773–782

  17. Park Y, Jeon M, Lee J et al (2022) MCW-Net: single image deraining with multi-level connections and wide regional non-local blocks [J]. Signal Process 105:116701

  18. Cooley JW, Lewis PAW, Welch PD (1969) The fast Fourier transform and its applications [J]. IEEE Trans Educ 12(1):27–34

  19. Deng G, Cahill L W. An adaptive Gaussian filter for noise reduction and edge detection [C]//1993 IEEE conference record nuclear science symposium and medical imaging conference. IEEE, 1993: 1615–1619

  20. Park N, Kim S (2022) How do vision transformers work? [J]. arXiv preprint arXiv:2202.06709

  21. Zhang K, Li Y, Liang J, et al. (2022) Practical blind denoising via swin-conv-unet and data synthesis [J]. arXiv preprint arXiv:2203.13278

  22. Chen L, Chu X, Zhang X, et al. (2022) Simple baselines for image restoration [J]. arXiv preprint arXiv:2204.04676

  23. Ren S, Zhou D, He S, et al. Shunted Self-Attention via Multi-Scale Token Aggregation [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 10853–10862

  24. Huang T, Huang L, You S, et al. (2022) LightViT: Towards Light-Weight Convolution-Free Vision Transformers [J]. arXiv preprint arXiv:2207.05557

  25. Fu X, Huang J, Ding X et al (2017) Clearing the skies: a deep network architecture for single-image rain removal [J]. IEEE Trans Image Process 26(6):2944–2956

  26. Zhang H, Patel V M. Density-aware single image de-raining using a multi-stream dense network [C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 695–704

  27. Yang W, Liu J, Yang S et al (2019) Scale-free single image deraining via visibility-enhanced recurrent wavelet learning [J]. IEEE Trans Image Process 28(6):2948–2961

  28. Zhao J, Xie J, Xiong R, et al. Pyramid Convolutional Network for Single Image Deraining [C]//CVPR Workshops. 2019: 9–16.

  29. Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need [J]. Advances in neural information processing systems. 30

  30. Dosovitskiy A, Beyer L, Kolesnikov A, et al. (2020) An image is worth 16x16 words: Transformers for image recognition at scale [J]. arXiv preprint arXiv:2010.11929

  31. Liu Z, Lin Y, Cao Y, et al. Swin transformer: Hierarchical vision transformer using shifted windows [C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 10012–10022.

  32. Liu Z, Hu H, Lin Y, et al. Swin transformer v2: Scaling up capacity and resolution [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 12009–12019.

  33. Yang F, Yang H, Fu J, et al. Learning texture transformer network for image super-resolution [C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 5791–5800.

  34. Chen H, Wang Y, Guo T, et al. Pre-trained image processing transformer [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12299–12310.

  35. Liang J, Cao J, Sun G, et al. Swinir: Image restoration using swin transformer [C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 1833–1844.

  36. Liu B, Liu W (2018) The lifting factorization of 2D 4-channel nonseparable wavelet transforms [J]. Inf Sci 456:113–130

  37. Wang C, Xu H, Zhang X, et al. Convolutional Embedding Makes Hierarchical Vision Transformer Stronger [J]. arXiv preprint arXiv:2207.13317, 2022.

  38. Liu B, Peng JX (2009) Fusion method of multi-spectral image and panchromatic image based on four channels non-separable additive wavelets. Chin J Computers 32(2):350–356 (In Chinese)

  39. Wang Z, Bovik AC, Sheikh HR et al (2004) Image quality assessment: from error visibility to structural similarity [J]. IEEE Trans Image Process 13(4):600–612

  40. Yang W, Tan R T, Feng J, et al. Deep joint rain detection and removal from a single image [C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1357–1366.

  41. Li Y, Tan R T, Guo X, et al. Rain streak removal using layer priors [C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 2736–2744.

  42. Zhang H, Sindagi V, Patel VM (2019) Image de-raining using a conditional generative adversarial network [J]. IEEE Trans Circuits Syst Video Technol 30(11):3943–3956

  43. Kingma D P, Ba J. (2014) Adam: A method for stochastic optimization [J]. arXiv preprint arXiv:1412.6980

  44. Li Y, Monno Y, Okutomi M. Single Image Deraining Network with Rain Embedding Consistency and Layered LSTM [C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2022: 4060–4069

  45. Jiang K, Wang Z, Yi P, et al. Multi-scale progressive fusion network for single image deraining [C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 8346–8355

  46. Ge Z, Liu S, Wang F, et al. (2021) Yolox: Exceeding yolo series in 2021 [J]. arXiv preprint arXiv:2107.08430

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61471160).

Author information

Corresponding author

Correspondence to Siyan Fang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

The time-domain form of the constructed two-dimensional four-channel non-separable lifting wavelet filter bank:

$$
\left\{
\begin{aligned}
\boldsymbol{H}_{0} &= \begin{bmatrix}
0.2980 & 0.1334 & -0.0552 & 0.1235 \\
-0.0084 & 0.0037 & 0.0015 & 0.0035 \\
0.0035 & 0.0015 & 0.0037 & -0.0084 \\
0.1235 & -0.0552 & 0.1334 & 0.2980
\end{bmatrix} \\
\boldsymbol{H}_{1} &= \begin{bmatrix}
-0.1235 & -0.0552 & -0.1334 & 0.2980 \\
0.0035 & -0.0015 & 0.0037 & 0.0084 \\
0.0084 & 0.0037 & -0.0015 & 0.0035 \\
0.2980 & -0.1334 & -0.0552 & -0.1235
\end{bmatrix} \\
\boldsymbol{H}_{2} &= \begin{bmatrix}
-0.1065 & -0.0477 & -0.1362 & 0.3045 \\
0.0030 & -0.0013 & 0.0038 & 0.0085 \\
-0.0085 & -0.0038 & 0.0013 & -0.0030 \\
-0.3045 & 0.1362 & 0.0477 & 0.1065
\end{bmatrix} \\
\boldsymbol{H}_{3} &= \begin{bmatrix}
0.3045 & 0.1362 & -0.0477 & 0.1065 \\
-0.0085 & 0.0038 & 0.0013 & 0.0030 \\
-0.0030 & -0.0013 & -0.0038 & 0.0085 \\
-0.1065 & 0.0477 & -0.1362 & -0.3045
\end{bmatrix}
\end{aligned}
\right.
$$
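
As a sanity check on the coefficients above, the sketch below applies the four filters directly to an image and repeats the step on the low-frequency subband to obtain a multi-level decomposition. It is a direct filtering view of the filter bank, not the lifting implementation used in the paper. The function names `analyze` and `multilevel`, the use of correlation rather than convolution, the periodic boundary extension, and the sampling alignment are all assumptions made for illustration.

```python
import numpy as np

# Filter coefficients copied from Appendix 1.
H0 = np.array([[ 0.2980,  0.1334, -0.0552,  0.1235],
               [-0.0084,  0.0037,  0.0015,  0.0035],
               [ 0.0035,  0.0015,  0.0037, -0.0084],
               [ 0.1235, -0.0552,  0.1334,  0.2980]])
H1 = np.array([[-0.1235, -0.0552, -0.1334,  0.2980],
               [ 0.0035, -0.0015,  0.0037,  0.0084],
               [ 0.0084,  0.0037, -0.0015,  0.0035],
               [ 0.2980, -0.1334, -0.0552, -0.1235]])
H2 = np.array([[-0.1065, -0.0477, -0.1362,  0.3045],
               [ 0.0030, -0.0013,  0.0038,  0.0085],
               [-0.0085, -0.0038,  0.0013, -0.0030],
               [-0.3045,  0.1362,  0.0477,  0.1065]])
H3 = np.array([[ 0.3045,  0.1362, -0.0477,  0.1065],
               [-0.0085,  0.0038,  0.0013,  0.0030],
               [-0.0030, -0.0013, -0.0038,  0.0085],
               [-0.1065,  0.0477, -0.1362, -0.3045]])
FILTERS = np.stack([H0, H1, H2, H3])

def analyze(image, filters=FILTERS):
    """One level: correlate with each 4x4 filter, keep every second sample.

    ASSUMPTION: periodic (wrap-around) boundary extension and this particular
    sampling alignment; the paper does not specify either in this excerpt.
    Returns an array of shape (4, H/2, W/2); index 0 is the low-frequency subband.
    """
    image = np.asarray(image, dtype=np.float64)
    padded = np.pad(image, ((0, 3), (0, 3)), mode="wrap")
    # windows[i, j] is the 4x4 patch whose top-left corner is at (2i, 2j).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (4, 4))[::2, ::2]
    return np.einsum("ijkl,ckl->cij", windows, filters)

def multilevel(image, levels):
    """Re-apply `analyze` to the low-frequency subband at each level."""
    low, details = image, []
    for _ in range(levels):
        subbands = analyze(low)
        low = subbands[0]
        details.append(subbands[1:])  # three high-frequency subbands
    return low, details

# A constant image should pass through H0 unchanged and vanish in H1..H3.
flat = np.full((8, 8), 2.0)
low, details = multilevel(flat, levels=2)
print(low.shape, np.allclose(low, 2.0, atol=1e-6))
print(all(np.allclose(d, 0.0, atol=1e-6) for d in details))
```

The check works because the entries of H0 sum to 1 while those of H1, H2 and H3 sum to 0, which is the expected behaviour of one low-pass and three high-pass channels.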

Appendix 2

The predict and update operators of the filter bank:

$$
\left\{
\begin{aligned}
\boldsymbol{Predict}_{1} &= \begin{bmatrix}
1 & 0 & 0 & 0 \\
-0.0281 & 1 & 0 & 0 \\
0.4475 & 0 & 1 & 0 \\
0.0126 & -0.4475 & 0.0281 & 1
\end{bmatrix} \\
\boldsymbol{Predict}_{2} &= \begin{bmatrix}
1 & 0 & 0 & 0 \\
0.4142 & 1 & 0 & 0 \\
0.4142 & 1 & 1 & 0 \\
1 & 0 & 0.3499 & 1
\end{bmatrix} \\
\boldsymbol{Update}_{1} &= \begin{bmatrix}
0.9124 & 0.0256 & -0.4083 & 0.0115 \\
0 & 0.9131 & 0 & 0.4086 \\
0 & 0 & 1.0951 & -0.0307 \\
0 & 0 & 0 & 1.0960
\end{bmatrix} \\
\boldsymbol{Update}_{2} &= \begin{bmatrix}
1.3066 & -0.5412 & -0.4671 & 1.3349 \\
0 & 1.5307 & -1.1414 & -1.0200 \\
0 & 0 & 2.6697 & 0.9342 \\
0 & 0 & 0 & -2.9966
\end{bmatrix}
\end{aligned}
\right.
$$
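
The predict operators above are lower triangular with a unit diagonal, and the update operators are upper triangular with a non-zero diagonal, so each step is invertible. The sketch below illustrates only that property. It treats the operators as constant 4×4 matrices acting on the four samples of every 2×2 block, and applies them in an assumed order; the order, any delays between steps, and the names `polyphase`, `forward` and `inverse` are hypothetical and not taken from the paper.

```python
import numpy as np

# Operator coefficients copied from Appendix 2.
PREDICT_1 = np.array([[ 1.0,     0.0,     0.0,    0.0],
                      [-0.0281,  1.0,     0.0,    0.0],
                      [ 0.4475,  0.0,     1.0,    0.0],
                      [ 0.0126, -0.4475,  0.0281, 1.0]])
PREDICT_2 = np.array([[1.0,    0.0, 0.0,    0.0],
                      [0.4142, 1.0, 0.0,    0.0],
                      [0.4142, 1.0, 1.0,    0.0],
                      [1.0,    0.0, 0.3499, 1.0]])
UPDATE_1 = np.array([[0.9124, 0.0256, -0.4083,  0.0115],
                     [0.0,    0.9131,  0.0,     0.4086],
                     [0.0,    0.0,     1.0951, -0.0307],
                     [0.0,    0.0,     0.0,     1.0960]])
UPDATE_2 = np.array([[1.3066, -0.5412, -0.4671,  1.3349],
                     [0.0,     1.5307, -1.1414, -1.0200],
                     [0.0,     0.0,     2.6697,  0.9342],
                     [0.0,     0.0,     0.0,    -2.9966]])

# ASSUMPTION: this ordering is illustrative only.
STEPS = [PREDICT_1, UPDATE_1, PREDICT_2, UPDATE_2]

def polyphase(image):
    """Stack the four samples of every 2x2 block: shape (4, H/2, W/2)."""
    return np.stack([image[0::2, 0::2], image[0::2, 1::2],
                     image[1::2, 0::2], image[1::2, 1::2]])

def forward(components, steps=STEPS):
    """Apply each 4x4 step across the channel axis, in order."""
    out = components
    for step in steps:
        out = np.einsum("ab,bij->aij", step, out)
    return out

def inverse(coeffs, steps=STEPS):
    """Undo the steps in reverse order by solving each linear system."""
    out = coeffs
    for step in reversed(steps):
        out = np.linalg.solve(step, out.reshape(4, -1)).reshape(out.shape)
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8))
components = polyphase(image)
restored = inverse(forward(components))
print(np.allclose(restored, components))
```

A general solver is used here for brevity; because every step is triangular, forward or back substitution would undo it just as well.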

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, B., Fang, S. Multi-level wavelet network based on CNN-Transformer hybrid attention for single image deraining. Neural Comput & Applic 35, 22387–22404 (2023). https://doi.org/10.1007/s00521-023-08899-x

  • DOI: https://doi.org/10.1007/s00521-023-08899-x
