Abstract
Removing rain streaks from rainy images can improve the accuracy of computer vision applications such as object detection. To make full use of the frequency-domain analysis capability of wavelets and to combine the advantages of Convolutional Neural Networks (CNNs) and Transformers, a Multi-level Wavelet Network Based on CNN-Transformer Hybrid Attention (MWN-CTHA) for single image deraining is proposed. MWN-CTHA obtains multi-scale low-frequency and high-frequency images through a multi-level non-separable lifting wavelet transform and uses a CNN-Transformer Hybrid Attention Block (CTHAB) to learn global structure from the low-frequency images and detail information from the high-frequency images. CTHAB consists of a CA-SA Layer (CSL) and a Detail-enhanced Attention Feed-forward Layer (DAFL). CSL uses the non-local modeling ability of self-attention to capture long-range rain streaks and uses convolutional attention to strengthen the search for local rain streaks; the convolution also helps self-attention achieve a better feature representation. DAFL uses a depth-wise convolutional layer to supplement detail features and filters the information in the feed-forward layer through Dual-branch Attention. Experimental results on four synthetic datasets show that the proposed method achieves higher PSNR and SSIM than the state-of-the-art method DANet, with improvements of 1.07 dB and 0.0098, respectively. The code is available at https://github.com/fashyon/MWN-CTHA.
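To make the multi-level decomposition described in the abstract concrete, the following is a minimal sketch of how an image can be split into one low-frequency image and several scales of high-frequency images. It is an illustration only: it uses the ordinary separable Haar wavelet as a stand-in, not the non-separable lifting wavelet constructed in the paper, and the function names (`haar_dwt2`, `haar_idwt2`, `multilevel_decompose`, `multilevel_reconstruct`) are hypothetical and do not come from the MWN-CTHA code base.

```python
import numpy as np


def haar_dwt2(x):
    """One level of a 2D orthonormal Haar transform (stand-in wavelet).

    Returns the low-frequency sub-band and a tuple of three
    high-frequency sub-bands, each half the height and width of x.
    Assumes x has even height and width.
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right pixel
    c = x[1::2, 0::2]  # bottom-left pixel
    d = x[1::2, 1::2]  # bottom-right pixel
    low = (a + b + c + d) / 2.0    # scaled block average
    high1 = (a - b + c - d) / 2.0  # left column minus right column
    high2 = (a + b - c - d) / 2.0  # top row minus bottom row
    high3 = (a - b - c + d) / 2.0  # diagonal difference
    return low, (high1, high2, high3)


def haar_idwt2(low, highs):
    """Exact inverse of haar_dwt2."""
    high1, high2, high3 = highs
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=low.dtype)
    x[0::2, 0::2] = (low + high1 + high2 + high3) / 2.0
    x[0::2, 1::2] = (low - high1 + high2 - high3) / 2.0
    x[1::2, 0::2] = (low + high1 - high2 - high3) / 2.0
    x[1::2, 1::2] = (low - high1 - high2 + high3) / 2.0
    return x


def multilevel_decompose(image, levels):
    """Repeatedly decompose the low-frequency sub-band.

    Returns the coarsest low-frequency image and a list of
    high-frequency tuples ordered from finest to coarsest scale.
    """
    low = np.asarray(image, dtype=np.float64)
    pyramid = []
    for _ in range(levels):
        low, highs = haar_dwt2(low)
        pyramid.append(highs)
    return low, pyramid


def multilevel_reconstruct(low, pyramid):
    """Invert multilevel_decompose, starting from the coarsest scale."""
    for highs in reversed(pyramid):
        low = haar_idwt2(low, highs)
    return low
```

In a network of the kind the abstract outlines, the low-frequency image and the per-scale high-frequency images would each be passed to learned blocks before reconstruction; this sketch covers only the decomposition and its inverse.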
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
References
Chen Y L, Hsu C T. A generalized low-rank appearance model for spatio-temporally correlated rain streaks [C]. Proceedings of the IEEE international conference on computer vision. 2013: 1968–1975.
Luo Y, Xu Y, Ji H. Removing rain from a single image via discriminative sparse coding [C]. Proceedings of the IEEE international conference on computer vision. 2015: 3397–3405.
Kang LW, Lin CW, Fu YH (2011) Automatic single-image-based rain streaks removal via image decomposition [J]. IEEE Trans Image Process 21(4):1742–1755
Li X, Wu J, Lin Z, et al. Recurrent squeeze-and-excitation context aggregation net for single image deraining [C]. Proceedings of the European conference on computer vision (ECCV). 2018: 254–269
Ren D, Zuo W, Hu Q, et al. Progressive image deraining networks: A better and simpler baseline [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3937–3946
Wang T, Yang X, Xu K, et al. Spatial attentive single-image deraining with a high quality real rain dataset [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 12270–12279
Ren D, Shang W, Zhu P et al (2020) Single image deraining using bilateral recurrent network [J]. IEEE Trans Image Process 29:6852–6863
Wang C, Xing X, Wu Y, et al. Dcsfn: Deep cross-scale fusion network for single image rain removal [C]. Proceedings of the 28th ACM international conference on multimedia. 2020: 1643–1651
Wang H, Xie Q, Zhao Q, et al. A model-driven deep neural network for single image rain removal [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 3103–3112
Guo Q, Sun J, Juefei-Xu F, Ma L, Xie X, Feng W, Liu Y, Zhao J (2021) EfficientDeRain: learning pixel-wise dilation filtering for high-efficiency single-image deraining [C]. Proc AAAI Conf Artif Intell 35(2):1487–1495
Yi Q, Li J, Dai Q, et al. Structure-Preserving Deraining with Residue Channel Prior Guidance [C]. Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 4238–4247.
Jiang K, Wang Z, Yi P et al. (2021) Rain-free and residue hand-in-hand: A progressive coupled network for real-time image deraining[J]. IEEE Trans Image Process 30:7404–7418
Cui X, Wang C, Ren D, et al. (2022) Semi-supervised image deraining using knowledge distillation[J]. IEEE Transactions on Circuits and Systems for Video Technology 32(12):8327–8341
Jiang K, Wang Z, Chen C, et al. (2022) Magic ELF: Image deraining meets association learning and transformer[J]. arXiv preprint arXiv:2207.10455
Jiang K, Wang Z, Wang Z, et al. (2022) Danet: Image deraining via dynamic association learning [C]//Proc 31st Int Joint Conf Artif Intell
Liu P, Zhang H, Zhang K, et al. Multi-level wavelet-CNN for image restoration [C]//Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2018: 773–782
Park Y, Jeon M, Lee J et al (2022) MCW-Net: single image deraining with multi-level connections and wide regional non-local blocks [J]. Signal Process 105:116701
Cooley JW, Lewis PAW, Welch PD (1969) The fast Fourier transform and its applications [J]. IEEE Trans Educ 12(1):27–34
Deng G, Cahill L W. An adaptive Gaussian filter for noise reduction and edge detection [C]//1993 IEEE conference record nuclear science symposium and medical imaging conference. IEEE, 1993: 1615–1619
Park N, Kim S (2022) How do vision transformers work? [J]. arXiv preprint arXiv:2202.06709
Zhang K, Li Y, Liang J, et al. (2022) Practical blind denoising via swin-conv-unet and data synthesis [J]. arXiv preprint arXiv:2203.13278
Chen L, Chu X, Zhang X, et al. (2022) Simple baselines for image restoration [J]. arXiv preprint arXiv:2204.04676
Ren S, Zhou D, He S, et al. Shunted Self-Attention via Multi-Scale Token Aggregation [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 10853–10862
Huang T, Huang L, You S, et al. (2022) LightViT: Towards Light-Weight Convolution-Free Vision Transformers [J]. arXiv preprint arXiv:2207.05557
Fu X, Huang J, Ding X et al (2017) Clearing the skies: a deep network architecture for single-image rain removal [J]. IEEE Trans Image Process 26(6):2944–2956
Zhang H, Patel V M. Density-aware single image de-raining using a multi-stream dense network [C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 695–704
Yang W, Liu J, Yang S et al (2019) Scale-free single image deraining via visibility-enhanced recurrent wavelet learning [J]. IEEE Trans Image Process 28(6):2948–2961
Zhao J, Xie J, Xiong R, et al. Pyramid Convolutional Network for Single Image Deraining [C]//CVPR Workshops. 2019: 9–16.
Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need [J]. Advances in neural information processing systems. 30
Dosovitskiy A, Beyer L, Kolesnikov A, et al. (2020) An image is worth 16x16 words: Transformers for image recognition at scale [J]. arXiv preprint arXiv:2010.11929
Liu Z, Lin Y, Cao Y, et al. Swin transformer: Hierarchical vision transformer using shifted windows [C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 10012–10022.
Liu Z, Hu H, Lin Y, et al. Swin transformer v2: Scaling up capacity and resolution [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 12009–12019.
Yang F, Yang H, Fu J, et al. Learning texture transformer network for image super-resolution [C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 5791–5800.
Chen H, Wang Y, Guo T, et al. Pre-trained image processing transformer [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12299–12310.
Liang J, Cao J, Sun G, et al. Swinir: Image restoration using swin transformer [C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 1833–1844.
Liu B, Liu W (2018) The lifting factorization of 2D 4-channel nonseparable wavelet transforms [J]. Inf Sci 456:113–130
Wang C, Xu H, Zhang X, et al. Convolutional Embedding Makes Hierarchical Vision Transformer Stronger[J]. arXiv preprint arXiv:2207.13317, 2022.
Liu B, Peng JX (2009) Fusion method of multi-spectral image and panchromatic image based on four channels non-separable additive wavelets. Chin J Computers 32(2):350–356 (In Chinese)
Wang Z, Bovik AC, Sheikh HR et al (2004) Image quality assessment: from error visibility to structural similarity [J]. IEEE Trans Image Process 13(4):600–612
Yang W, Tan R T, Feng J, et al. Deep joint rain detection and removal from a single image [C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1357–1366.
Li Y, Tan R T, Guo X, et al. Rain streak removal using layer priors [C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 2736–2744.
Zhang H, Sindagi V, Patel VM (2019) Image de-raining using a conditional generative adversarial network[J]. IEEE Trans Circuits Syst Video Technol 30(11):3943–3956
Kingma D P, Ba J. (2014) Adam: A method for stochastic optimization [J]. arXiv preprint arXiv:1412.6980
Li Y, Monno Y, Okutomi M. Single Image Deraining Network with Rain Embedding Consistency and Layered LSTM [C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2022: 4060–4069
Jiang K, Wang Z, Yi P, et al. Multi-scale progressive fusion network for single image deraining [C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 8346–8355
Ge Z, Liu S, Wang F, et al. (2021) Yolox: Exceeding yolo series in 2021 [J]. arXiv preprint arXiv:2107.08430
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61471160).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendices
Appendix 1
The time-domain form of the constructed two-dimensional four-channel non-separable lifting wavelet filter bank:
Appendix 2
The predict and update operators of the filter bank:
About this article
Cite this article
Liu, B., Fang, S. Multi-level wavelet network based on CNN-Transformer hybrid attention for single image deraining. Neural Comput & Applic 35, 22387–22404 (2023). https://doi.org/10.1007/s00521-023-08899-x
DOI: https://doi.org/10.1007/s00521-023-08899-x