DOI: 10.1145/3551626.3564946
research-article

Multi-Scale Channel Transformer Network for Single Image Deraining

Published: 13 December 2022

ABSTRACT

Single image deraining is challenging: it requires restoring the spatial details and high-level contextual structures of an image while also removing multiple layers of rain with varying degrees of blur and resolution. Because of their powerful ability to model long-range dependencies, transformer-based models have recently shown superior performance on high-level vision tasks and have begun to be applied to low-level vision tasks such as image restoration. However, the computational complexity of spatial self-attention increases quadratically with spatial resolution, which makes these models impractical for high-resolution images. In this study, we propose a novel Channel Transformer that performs self-attention along the channel direction instead of the spatial direction. Specifically, we first incorporate multiple channel transformer blocks into a multi-scale architecture to extract multi-scale contexts and exploit long-range dependencies across channels, and then learn a coarse estimate of the rain-free image. Finally, an original-resolution CNN-based module refines the coarse estimate by leveraging the previously learned multi-scale contexts. Experiments on several benchmark datasets demonstrate the superiority of the proposed method over state-of-the-art methods.
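
To make the channel-direction idea concrete, the following is a minimal sketch of self-attention computed across channels rather than across pixels. It is not the authors' implementation: the abstract does not give the block design, so the function names (`channel_self_attention`, `attention_map_entries`), the identity query/key/value projections, and the scaling by the square root of the token length are all assumptions made for illustration. Each channel is treated as one token whose features are its flattened pixels, so the attention map has C x C entries instead of (H*W) x (H*W).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_self_attention(feat):
    """Self-attention along the channel axis (illustrative sketch).

    feat: list of C channels, each a flattened list of H*W values.
    Assumption: queries, keys and values are the input itself
    (no learned projections), which keeps the example short.
    Returns (output with the same shape as feat, C x C attention map).
    """
    n = len(feat[0])        # token length = H*W pixels per channel
    scale = math.sqrt(n)    # assumed scaling, analogous to sqrt(d_k)

    # C x C score matrix: scaled dot product between every pair of channels.
    scores = [
        [sum(a * b for a, b in zip(q, k)) / scale for k in feat]
        for q in feat
    ]
    attn = [softmax(row) for row in scores]

    # Each output channel is an attention-weighted mix of all input channels.
    out = [
        [sum(w * ch[p] for w, ch in zip(row, feat)) for p in range(n)]
        for row in attn
    ]
    return out, attn


def attention_map_entries(channels, height, width):
    """Number of entries in the attention map for each attention direction."""
    return {
        "spatial": (height * width) ** 2,  # grows quadratically with resolution
        "channel": channels ** 2,          # independent of resolution
    }


if __name__ == "__main__":
    # Toy feature map: 3 channels, 2x2 pixels each (flattened to length 4).
    toy = [[1.0, 0.0, 2.0, 1.0], [0.5, 1.5, 0.0, 1.0], [2.0, 2.0, 1.0, 0.0]]
    mixed, attention = channel_self_attention(toy)
    print(len(attention), "x", len(attention[0]), "attention map")
    print(attention_map_entries(64, 256, 256))
```

Under these assumptions, a 64-channel feature map at 256 x 256 resolution needs 4,294,967,296 attention entries in the spatial direction but only 4,096 in the channel direction, which is the scaling argument the abstract makes. A practical version would use a tensor library and learned projections.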

      • Published in

        MMAsia '22: Proceedings of the 4th ACM International Conference on Multimedia in Asia
        December 2022
        296 pages
        ISBN:9781450394789
        DOI:10.1145/3551626

        Copyright © 2022 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 59 of 204 submissions, 29%
