
Joint-attention Discriminator for Accurate Super-resolution via Adversarial Training

Published: 15 October 2019
DOI: 10.1145/3343031.3351008

ABSTRACT

Tremendous progress has been made in single image super-resolution (SR), where existing deep SR models achieve impressive performance on objective criteria such as PSNR and SSIM. However, most of these methods are limited in visual perception; for example, their results look too smooth. SR based on generative adversarial networks (GANs) surpasses most deep SR models in visual quality but is poor on objective criteria. To trade off objective and subjective SR performance, we design a joint-attention discriminator with which a GAN improves SR performance in PSNR and SSIM while maintaining the visual quality of non-attention GAN-based SR models. The joint-attention discriminator contains dense channel-wise attention blocks and cross-layer attention blocks. The former are applied in the shallow layers of the discriminator to compute channel-wise weighted combinations of feature maps; the latter select feature maps in the middle and deep layers for effective discrimination. Extensive experiments on six benchmark datasets show that our proposed discriminator, combined with different generators, achieves more realistic visual results.
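To make the two mechanisms named in the abstract concrete, the following is a minimal PyTorch sketch of a channel-wise attention block (in the squeeze-and-excitation style common in the SR attention literature) and a cross-layer attention gate that uses a shallow feature map to select channels of a deeper one. This is one plausible reading of the abstract, not the authors' implementation: the module names, the reduction ratio, the 1x1 projection, and the shallow-to-deep gating are all illustrative assumptions.

```python
# Sketch of the two attention mechanisms the abstract names. Module names,
# the reduction ratio, and the wiring are assumptions, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Weight each feature-map channel by a learned scalar (shallow layers)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # global average pool -> (b, c) weights
        return x * w.view(b, c, 1, 1)     # channel-wise weighting of feature maps


class CrossLayerAttention(nn.Module):
    """Select deep feature maps with a mask derived from a shallower layer."""

    def __init__(self, shallow_ch: int, deep_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(shallow_ch, deep_ch, kernel_size=1)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        # Match the shallow map to the deep map's spatial size, project it to
        # the deep channel count, and use the result as a soft selection mask.
        s = F.interpolate(shallow, size=deep.shape[2:], mode="bilinear",
                          align_corners=False)
        return deep * torch.sigmoid(self.proj(s))


if __name__ == "__main__":
    ca = ChannelAttention(64)
    cla = CrossLayerAttention(shallow_ch=64, deep_ch=256)
    shallow = torch.randn(2, 64, 48, 48)
    deep = torch.randn(2, 256, 12, 12)
    print(ca(shallow).shape)          # torch.Size([2, 64, 48, 48])
    print(cla(shallow, deep).shape)   # torch.Size([2, 256, 12, 12])
```

In a discriminator built this way, the channel-attention blocks would sit after the shallow convolutional stages and the cross-layer gates would bridge middle and deep stages, matching the abstract's division of labor between the two block types.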


Published in

MM '19: Proceedings of the 27th ACM International Conference on Multimedia
October 2019, 2794 pages
ISBN: 9781450368896
DOI: 10.1145/3343031
Copyright © 2019 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

MM '19 paper acceptance rate: 252 of 936 submissions, 27%
Overall acceptance rate: 995 of 4,171 submissions, 24%
