Spatial-guided informative semantic joint transformer for single-image deraining

Li, Haiyan; Peng, Shaolin; Lang, Xun; Ye, Shuhua; Li, Hongsong

doi:10.1007/s11227-023-05697-z

Published: 25 October 2023

Volume 80, pages 6522–6551, (2024)
The Journal of Supercomputing

Haiyan Li¹, Shaolin Peng¹, Xun Lang¹, Shuhua Ye² & Hongsong Li¹

Abstract

Single-image deraining is of great significance for image recognition and analysis. However, most current methods suffer from incomplete removal of tiny rain streaks, blurred restoration of background structure, and feature interference. To address these issues, a spatial-guided informative semantic joint transformer (SISTrans) is proposed. Specifically, a high-dimensional spatial feature mapping module is introduced to restrain the receptive field of the filters by increasing the spatial resolution of the image, guiding the module to focus on local features and thereby learn the distribution of small rain streaks. Subsequently, a wavelet-content-aware-based dual-level module is designed to capture high-level semantic information with an improved Swin transformer and to establish effective long-range dependencies via a shifted window mechanism, thereby enhancing the quality of background restoration. Finally, a dynamic hybrid cross-fusion module is proposed to avoid feature interference by recalibrating the features of the two branches and fusing the calibrated features with a set of learnable parameters. Extensive experiments on eight commonly used benchmark datasets demonstrate that the proposed SISTrans outperforms state-of-the-art methods. Code is available at: https://github.com/SL-Pen/SISTrans.
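
The shifted window mechanism mentioned in the abstract can be illustrated with a minimal NumPy sketch. It shows only the generic Swin-style idea of cyclically shifting a feature map by half a window before partitioning it, so that each shifted window straddles the borders of the regular windows. It is not taken from the SISTrans code: the function names `window_partition` and `shifted_window_partition`, the 8 × 8 single-channel feature map, and the window size of 4 are assumptions chosen for illustration, and the attention computation and its masking are omitted.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size, window_size, C).
    Hypothetical helper for illustration only.
    """
    h, w, c = x.shape
    if h % window_size or w % window_size:
        raise ValueError("H and W must be multiples of window_size")
    x = x.reshape(h // window_size, window_size, w // window_size, window_size, c)
    # Bring the two window-grid axes together, then flatten them.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, c)


def shifted_window_partition(x, window_size):
    """Cyclically shift the map by half a window, then partition it.

    After the shift, each window mixes pixels that belonged to different
    regular windows, which is how information crosses window borders.
    """
    shift = window_size // 2
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


# Toy feature map: pixel (r, c) stores the value 8 * r + c.
feat = np.arange(8 * 8, dtype=np.float32).reshape(8, 8, 1)
regular = window_partition(feat, 4)
shifted = shifted_window_partition(feat, 4)
```

In this toy case the first regular window covers rows 0 to 3 and columns 0 to 3, while the first shifted window covers rows 2 to 5 and columns 2 to 5 of the original map, overlapping all four regular windows.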

Funding

This research was supported by the “Famous teacher of teaching” project of the Yunnan 10000 Talents Program and by the National Natural Science Foundation of China under Grants 62266049, 62166048 and 62066046.

Author information

Authors and Affiliations

School of Information, Yunnan University, Kunming, 650504, China
Haiyan Li, Shaolin Peng, Xun Lang & Hongsong Li
Asset Management Department, The Third Affiliated Hospital of Kunming Medical University, Kunming, 650118, Yunnan Province, China
Shuhua Ye

Contributions

HL and SP wrote the main manuscript text. HL, SP, and XL designed the deraining method. HL, SP, and SY improved the performance of the model and designed the ablation experiments. HL and HL performed the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Xun Lang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All experiments in this article were carried out computationally; they cause no harm to humans or animals and raise no moral or ethical concerns.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, H., Peng, S., Lang, X. et al. Spatial-guided informative semantic joint transformer for single-image deraining. J Supercomput 80, 6522–6551 (2024). https://doi.org/10.1007/s11227-023-05697-z

Accepted: 30 September 2023
Published: 25 October 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s11227-023-05697-z
