Spatial-guided informative semantic joint transformer for single-image deraining

Published in The Journal of Supercomputing

Abstract

Single-image deraining is of great significance for image recognition and analysis. However, most current methods face challenges such as incomplete removal of tiny rain streaks, blurred restoration of background structures, and feature interference. To address these issues, a spatial-guided informative semantic joint transformer (SISTrans) is proposed. Specifically, a high-dimensional spatial feature mapping module is put forward to constrain the receptive field of the filters by increasing the spatial resolution of the image, guiding the module to focus on local features and thereby learn the distribution of small rain streaks. Subsequently, a wavelet-content-aware-based dual-level module is designed to capture high-level semantic information with an improved Swin transformer and to establish effective long-range dependencies via a shifted-window mechanism, thereby enhancing the quality of background restoration. Finally, a dynamic hybrid cross-fusion module is proposed to avoid feature interference by recalibrating the features of the two branches and fusing the recalibrated features with a set of learnable parameters. Extensive experiments on eight commonly used benchmark datasets demonstrate that the proposed SISTrans outperforms state-of-the-art methods. Code is available at: https://github.com/SL-Pen/SISTrans.
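
To illustrate the cross-fusion idea described in the abstract (recalibrate two branch feature maps, then blend them with learnable coefficients), the following is a minimal PyTorch sketch. The module names, the squeeze-and-excitation-style recalibration, and the scalar fusion weights are illustrative assumptions, not the authors' implementation; the official code is at the repository linked above.

import torch
import torch.nn as nn


class ChannelRecalibration(nn.Module):
    # Squeeze-and-excitation-style recalibration of one branch's features
    # (an assumed recalibration scheme for illustration).
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-weight each channel by a learned attention score.
        return x * self.fc(self.pool(x))


class CrossFusion(nn.Module):
    # Fuse spatial-branch and semantic-branch features with learnable weights.
    def __init__(self, channels: int):
        super().__init__()
        self.recal_spatial = ChannelRecalibration(channels)
        self.recal_semantic = ChannelRecalibration(channels)
        # Learnable fusion coefficients, initialized to equal contribution.
        self.alpha = nn.Parameter(torch.tensor(0.5))
        self.beta = nn.Parameter(torch.tensor(0.5))
        self.proj = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, spatial_feat: torch.Tensor, semantic_feat: torch.Tensor) -> torch.Tensor:
        fused = self.alpha * self.recal_spatial(spatial_feat) \
            + self.beta * self.recal_semantic(semantic_feat)
        return self.proj(fused)


if __name__ == "__main__":
    fusion = CrossFusion(channels=32)
    a = torch.randn(1, 32, 64, 64)  # e.g., spatial-branch features
    b = torch.randn(1, 32, 64, 64)  # e.g., semantic-branch features
    print(fusion(a, b).shape)       # torch.Size([1, 32, 64, 64])

Because the fusion coefficients are trained jointly with the rest of the network, the relative contribution of the two branches can adapt to the data rather than being fixed by hand.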

Funding

This research was supported by the "Famous Teacher of Teaching" project of the Yunnan 10000 Talents Program and by the National Natural Science Foundation of China under Grants 62266049, 62166048, and 62066046.

Author information

Contributions

HL and SP wrote the main manuscript text. HL, SP, and XL designed the deraining method. HL, SP, and SY improved the performance of the model and designed the ablation experiments. HL and HL performed the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Xun Lang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All experiments in this article were conducted by running computer programs; they involve no humans or animals and raise no moral or ethical concerns.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, H., Peng, S., Lang, X. et al. Spatial-guided informative semantic joint transformer for single-image deraining. J Supercomput 80, 6522–6551 (2024). https://doi.org/10.1007/s11227-023-05697-z
