Abstract
Generative adversarial networks (GANs) are widely used for image super-resolution (SR) and have recently attracted increasing attention for their potential to generate rich details. However, their generators are usually convolutional neural networks, which lack global modeling capacity and thus limit performance. To address this problem, we propose a hierarchical partitioned Transformer block that extracts features at different scales, alleviating information loss and aiding global modeling. We then design a Transformer in residual block to reconstruct more natural structural textures in SR results. Finally, we integrate the intensify perception Transformer network with an existing discriminator network to form the intensify perception Transformer generative adversarial network (IPTGAN). We conducted experiments on several benchmark datasets, the RealSR dataset, and the PIRM self-validation dataset to verify the generalization ability of IPTGAN. The results show that IPTGAN achieves better visual quality with significantly lower complexity than several state-of-the-art GAN-based image SR methods.
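The core idea of partitioned attention at multiple scales can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, window sizes, and the simple averaging across partition scales are all assumptions made for illustration; the actual block additionally involves learned projections and residual convolutions.

```python
import numpy as np

def window_attention(x, window):
    """Self-attention computed independently within square windows.
    x: (H, W, C) feature map; window must evenly divide H and W."""
    h, w, c = x.shape
    out = np.empty_like(x)
    for i in range(0, h, window):
        for j in range(0, w, window):
            tokens = x[i:i + window, j:j + window].reshape(-1, c)  # (N, C)
            scores = tokens @ tokens.T / np.sqrt(c)                # (N, N)
            scores -= scores.max(axis=-1, keepdims=True)           # stable softmax
            attn = np.exp(scores)
            attn /= attn.sum(axis=-1, keepdims=True)
            out[i:i + window, j:j + window] = (attn @ tokens).reshape(window, window, c)
    return out

def hierarchical_partition_attention(x, windows=(4, 8)):
    """Hypothetical hierarchical partition: run windowed attention at
    several window sizes and average the results, so each position mixes
    fine local context with wider context from larger windows."""
    return sum(window_attention(x, w) for w in windows) / len(windows)

feat = np.random.randn(16, 16, 8)
out = hierarchical_partition_attention(feat)
print(out.shape)  # (16, 16, 8)
```

Computing attention inside windows keeps cost linear in image size (each window attends only to its own tokens), while combining several window sizes approximates the global receptive field that a plain CNN generator lacks.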
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Chen, Y., Wang, G., Chen, R., Hui, Z. (2023). Intensify Perception Transformer Generative Adversarial Network for Image Super-Resolution. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14358. Springer, Cham. https://doi.org/10.1007/978-3-031-46314-3_25
Print ISBN: 978-3-031-46313-6
Online ISBN: 978-3-031-46314-3