ABSTRACT
Single image deraining is challenging because it requires both restoring spatial details and high-level contextual structures and removing multiple rain layers with varying degrees of blur and resolution. Owing to their strong ability to model long-range dependencies, transformer-based models have recently shown superior performance on high-level vision tasks and are beginning to be applied to low-level vision tasks such as image restoration. However, the computational complexity of spatial self-attention grows quadratically with spatial resolution, which makes it impractical for high-resolution images. In this study, we propose a novel Channel Transformer that performs self-attention along the channel dimension instead of the spatial dimension. Specifically, we first incorporate multiple channel transformer blocks into a multi-scale architecture to extract multi-scale contexts and exploit long-range channel dependencies, and then learn a coarse estimate of the rain-free image. Finally, an original-resolution CNN-based module refines the coarse estimate by leveraging the previously learned multi-scale contexts. Experiments on several benchmark datasets demonstrate the superiority of the proposed method over state-of-the-art methods.
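To make the idea of attention along the channel dimension concrete, the following is a minimal NumPy sketch, not the paper's actual block. It assumes a single head, treats each channel as one token of length H*W, and uses hypothetical (C, C) projection matrices `w_q`, `w_k`, `w_v`; the scaling by sqrt(H*W) is also an assumption. The point it illustrates is that the attention map has shape (C, C) regardless of image size, so the cost grows linearly with the number of pixels rather than quadratically.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def channel_self_attention(features, w_q, w_k, w_v):
    """Illustrative channel-direction self-attention (assumed form, single head).

    features: array of shape (C, H, W)
    w_q, w_k, w_v: hypothetical (C, C) projections that mix channels
    returns: (output of shape (C, H, W), attention map of shape (C, C))
    """
    c, h, w = features.shape
    # Each channel becomes one token whose feature vector is its H*W pixels.
    tokens = features.reshape(c, h * w)

    q = w_q @ tokens  # (C, H*W)
    k = w_k @ tokens  # (C, H*W)
    v = w_v @ tokens  # (C, H*W)

    # Channel-to-channel similarity: (C, C), independent of spatial size.
    # Dividing by sqrt(H*W) is an assumed scaling, analogous to sqrt(d_k).
    scores = (q @ k.T) / np.sqrt(h * w)
    attention = softmax(scores, axis=-1)

    # Each output channel is a weighted mix of the value channels.
    out = attention @ v
    return out.reshape(c, h, w), attention


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    channels = 8
    w_q, w_k, w_v = (rng.standard_normal((channels, channels)) for _ in range(3))
    for height, width in [(16, 16), (64, 64)]:
        x = rng.standard_normal((channels, height, width))
        y, attn = channel_self_attention(x, w_q, w_k, w_v)
        print(x.shape, "->", y.shape, "attention map:", attn.shape)
```

In this sketch the matrix products cost on the order of C * C * H * W operations, whereas spatial self-attention over the same feature map would build an (H*W, H*W) attention map.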
Index Terms
- Multi-Scale Channel Transformer Network for Single Image Deraining