Abstract
Single-image deraining is of great significance for image recognition and analysis. However, most current methods suffer from incomplete removal of tiny rain streaks, blurred restoration of background structure, and feature interference. To address these issues, a spatial-guided informative semantic joint transformer (SISTrans) is proposed. Specifically, a high-dimensional spatial feature mapping module is put forward to restrain the receptive field of the filters by increasing the spatial resolution of the image, guiding the entire module to focus on local features and thereby learn the distribution of small rain streaks. Subsequently, a wavelet-content-aware-based dual-level module is designed to capture high-level semantic information with an improved Swin transformer and to establish effective long-range dependencies via a shifted window mechanism, thereby enhancing the quality of background restoration. Finally, a dynamic hybrid cross-fusion module is proposed to avoid feature interference by recalibrating the features of the two branches and fusing the calibrated features with a set of learnable parameters. Extensive experiments conducted on eight commonly used benchmark datasets demonstrate that the proposed SISTrans outperforms state-of-the-art methods. Code is available at: https://github.com/SL-Pen/SISTrans.
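The fusion step described in the abstract (recalibrate each branch, then combine the calibrated features with learnable parameters) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `recalibrate` and `weighted_fuse`, the sigmoid gate computed from each channel's mean, and the softmax normalisation of the fusion weights are all assumptions chosen for illustration, and features are plain nested lists (channels of flattened values) so the example runs without any dependencies.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    # Numerically stable softmax over a short list of logits.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def recalibrate(feature):
    # Hypothetical channel recalibration: each channel is scaled by a
    # sigmoid gate derived from its own mean (global average pooling).
    out = []
    for channel in feature:
        mean = sum(channel) / len(channel)
        gate = sigmoid(mean)
        out.append([gate * v for v in channel])
    return out


def weighted_fuse(local_feat, semantic_feat, fusion_logits):
    # fusion_logits stands in for the learnable parameters; softmax keeps
    # the two branch weights positive and summing to one (an assumption).
    w_local, w_semantic = softmax(fusion_logits)
    local_cal = recalibrate(local_feat)
    semantic_cal = recalibrate(semantic_feat)
    return [
        [w_local * a + w_semantic * b for a, b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(local_cal, semantic_cal)
    ]


if __name__ == "__main__":
    local_branch = [[2.0, -2.0]]      # one channel, two positions
    semantic_branch = [[4.0, -4.0]]
    print(weighted_fuse(local_branch, semantic_branch, [0.0, 0.0]))
```

In a trained network the fusion logits would be updated by gradient descent together with the other weights; here they are fixed inputs so that the arithmetic is easy to follow.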
Funding
This research was supported by the “Famous teacher of teaching” of the Yunnan 10000 Talents Program and by the National Natural Science Foundation of China under Grants 62266049, 62166048 and 62066046.
Author information
Contributions
HL and SP wrote the main manuscript text. HL, SP, and XL designed the deraining method. HL, SP, and SY improved the performance of the model and designed the ablation experiments. HL and HL performed the experiments. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
All experiments in this article were carried out by running computer programs; they cause no harm to humans or animals and raise no moral or ethical concerns.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, H., Peng, S., Lang, X. et al. Spatial-guided informative semantic joint transformer for single-image deraining. J Supercomput 80, 6522–6551 (2024). https://doi.org/10.1007/s11227-023-05697-z
DOI: https://doi.org/10.1007/s11227-023-05697-z