research-article

Multi-Scale Channel Transformer Network for Single Image Deraining

Authors:
Yuto Namba, Yamaguchi University, Yamaguchi, Japan
Xian-Hua Han, Yamaguchi University, Yamaguchi, Japan

MMAsia '22: Proceedings of the 4th ACM International Conference on Multimedia in Asia, December 2022, Article No.: 20, Pages 1–7, https://doi.org/10.1145/3551626.3564946

Published: 13 December 2022


ABSTRACT

Single image deraining is a challenging task: it requires not only restoring the spatial details and high-level contextual structures of an image, but also removing multiple layers of rain with varying degrees of blurring and resolution. Recently, owing to their powerful capability for modeling long-range dependencies, transformer-based models have shown superior performance on high-level vision tasks and have begun to be applied to low-level vision tasks such as image restoration. However, the computational complexity of their spatial self-attention increases quadratically with spatial resolution, which makes them impractical for high-resolution images. In this study, we propose a novel Channel Transformer, which performs self-attention in the channel direction instead of the spatial direction. Specifically, we first incorporate multiple channel transformer blocks into a multi-scale architecture to extract multi-scale contexts and exploit long-range channel dependencies, and then learn a coarse estimation of the rain-free image. Finally, an original-resolution CNN-based module refines the coarse estimation by leveraging the previously learned multi-scale contexts. Experiments on several benchmark datasets demonstrate the superiority of the proposed method over state-of-the-art methods.
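To make the idea of attention "in the channel direction" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single feature map of shape (C, H, W), treats each channel as one token of length N = H*W, and uses hypothetical (C, C) projection matrices `w_q`, `w_k`, `w_v` in place of learned 1x1 convolutions; the sqrt(N) scaling is likewise an assumption. The attention map has size C x C regardless of resolution, whereas spatial self-attention over the same map would need an N x N map.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def channel_self_attention(features, w_q, w_k, w_v):
    """Self-attention across channels for a (C, H, W) feature map.

    w_q, w_k, w_v are hypothetical (C, C) projections standing in for
    1x1 convolutions; they are illustrative, not taken from the paper.
    """
    c, h, w = features.shape
    tokens = features.reshape(c, h * w)  # one token per channel, length N = H*W
    q, k, v = w_q @ tokens, w_k @ tokens, w_v @ tokens
    # The attention map is (C, C): its size does not depend on H or W.
    # Scaling by sqrt(N) is an assumption made for this sketch.
    attn = softmax(q @ k.T / np.sqrt(h * w), axis=-1)
    out = attn @ v  # (C, N)
    return out.reshape(c, h, w), attn


rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
x = rng.standard_normal((C, H, W))
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
y, attn = channel_self_attention(x, w_q, w_k, w_v)
print(y.shape, attn.shape)  # (8, 16, 16) (8, 8)
```

In this toy setting the channel attention map holds 8 x 8 = 64 entries, while a spatial attention map over the same 16 x 16 grid would hold 256 x 256 = 65,536. The matrix products in the sketch cost on the order of C^2 * N operations, so the cost grows linearly with the number of pixels.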


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
  2. Machine learning


Published in
MMAsia '22: Proceedings of the 4th ACM International Conference on Multimedia in Asia
December 2022
296 pages
ISBN: 9781450394789
DOI: 10.1145/3551626
Conference Chair: Shuqiang Jiang (CAS)
General Chairs: Kiyoharu Aizawa (The University of Tokyo), Phoebe Chen (La Trobe), Keiji Yanai (The University of Electro-Communications)
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
computer vision
low-level vision task
single image deraining
transformer
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 59 of 204 submissions, 29%
