Recurrent SinGAN: Towards Scale-Agnostic Single Image GANs

Authors:

Zhenyong Fu

EITCE '21: Proceedings of the 2021 5th International Conference on Electronic Information Technology and Computer Engineering

Pages 361 - 366

https://doi.org/10.1145/3501409.3501476

Published: 31 December 2021

Abstract

Learning a deep generative model from a single image has attracted considerable attention recently. In this paper, we present a single image generative model, named recurrent SinGAN, based on the recently proposed SinGAN [22]. The original SinGAN relies on a pyramid of generators to learn the patch distribution at each scale of a multi-scale image pyramid. We replace this pyramid of generators with a single recurrent generator. This generator learns the patch distributions across multiple scales, yielding a scale-agnostic single image generative model.

On the image synthesis task, our recurrent SinGAN performs on par with the original SinGAN, while reducing the training time by almost 60% and using 4.5x fewer parameters.

Moreover, the results on various image manipulation tasks, including paint-to-image, image editing, harmonization, and image super-resolution, further verify the effectiveness of our proposed method.
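The central idea of the abstract, reusing one generator at every scale of a coarse-to-fine pyramid instead of training a separate generator per scale, can be illustrated with a toy sketch. This is not the authors' architecture: the single 3x3 convolution standing in for a generator, the residual refinement step, the scale sizes, and all function names are assumptions chosen only to show how weight sharing changes the parameter count.

```python
import numpy as np

def conv3x3(x, w):
    """Same-padded 3x3 convolution of a 2-D array x with kernel w."""
    p = np.pad(x, 1, mode="edge")
    h, wd = x.shape
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * p[i:i + h, j:j + wd]
    return out

def upsample(x, size):
    """Nearest-neighbour resize of a square 2-D array to (size, size)."""
    idx = np.arange(size) * x.shape[0] // size
    return x[np.ix_(idx, idx)]

def generate(weights_per_scale, sizes, rng):
    """Coarse-to-fine synthesis: upsample, inject noise, refine residually."""
    img = np.zeros((sizes[0], sizes[0]))
    for w, s in zip(weights_per_scale, sizes):
        img = upsample(img, s)
        z = rng.standard_normal((s, s))
        # Toy "generator": one conv layer plus a residual connection.
        img = img + np.tanh(conv3x3(img + z, w))
    return img

sizes = [8, 12, 16, 24, 32]  # hypothetical image pyramid, coarse to fine
rng = np.random.default_rng(0)

# Pyramid-style model: an independent set of weights for every scale.
pyramid_weights = [0.1 * rng.standard_normal((3, 3)) for _ in sizes]
# Recurrent-style model: the same weights applied at every scale.
shared = 0.1 * rng.standard_normal((3, 3))
recurrent_weights = [shared] * len(sizes)

pyramid_params = sum(w.size for w in pyramid_weights)
recurrent_params = shared.size
out = generate(recurrent_weights, sizes, rng)
print(out.shape, pyramid_params, recurrent_params)
```

In this toy setting the parameter ratio simply equals the number of scales; it is not meant to reproduce the 4.5x reduction reported above, which depends on the actual network design.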

References

[1]

Sefi Bell-Kligler, Assaf Shocher, and Michal Irani. 2019. Blind super-resolution kernel estimation using an internal-gan. In Advances in Neural Information Processing Systems. 284--293.

[2]

Taeg Sang Cho, Moshe Butman, Shai Avidan, and William T Freeman. 2008. The patch transform and its applications to image editing. In 2008 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1--8.

[3]

Emily L Denton, Soumith Chintala, Rob Fergus, et al. 2015. Deep generative image models using a laplacian pyramid of adversarial networks. In Advances in neural information processing systems. 1486--1494.

[4]

Laurent Dinh, David Krueger, and Yoshua Bengio. 2014. Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014).

[5]

Daniel Glasner, Shai Bagon, and Michal Irani. 2009. Super-resolution from a single image. In 2009 IEEE 12th international conference on computer vision. IEEE, 349--356.

[6]

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672--2680.

[7]

Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. 2015. Draw: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623 (2015).

[8]

Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. 2017. Improved training of wasserstein gans. In Advances in neural information processing systems. 5767--5777.

[9]

Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. Gans trained by a two time-scale update rule converge to a local nash equilibrium. arXiv preprint arXiv:1706.08500 (2017).

[10]

Tobias Hinz, Matthew Fisher, Oliver Wang, and Stefan Wermter. 2020. Improved Techniques for Training Single-Image GANs. arXiv preprint arXiv:2003.11512 (2020).

[11]

Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja. 2015. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE conference on computer vision and pattern recognition. 5197--5206.

[12]

Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1125--1134.

[13]

Tero Karras, Samuli Laine, and Timo Aila. 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4401--4410.

[14]

Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

[15]

Durk P Kingma and Prafulla Dhariwal. 2018. Glow: Generative flow with invertible 1x1 convolutions. In Advances in neural information processing systems. 10215--10224.

[16]

Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013).

[17]

Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. 2017. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4681--4690.

[18]

Yanghao Li, Yuntao Chen, Naiyan Wang, and Zhaoxiang Zhang. 2019. Scale-aware trident networks for object detection. In Proceedings of the IEEE international conference on computer vision. 6054--6063.

[19]

D. Martin, C. Fowlkes, D. Tal, and J. Malik. 2001. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. In Proc. 8th Int'l Conf. Computer Vision, Vol. 2. 416--423.

[20]

Tomer Michaeli and Michal Irani. 2014. Blind deblurring using internal patch recurrence. In European conference on computer vision. Springer, 783--798.

[21]

Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved techniques for training gans. arXiv preprint arXiv:1606.03498 (2016).

[22]

Tamar Rott Shaham, Tali Dekel, and Tomer Michaeli. 2019. Singan: Learning a generative model from a single natural image. In Proceedings of the IEEE International Conference on Computer Vision. 4570--4580.

[23]

Assaf Shocher, Shai Bagon, Phillip Isola, and Michal Irani. 2018. InGAN: Capturing and Remapping the "DNA" of a Natural Image. arXiv preprint arXiv:1812.00231 (2018).

[24]

Assaf Shocher, Nadav Cohen, and Michal Irani. 2018. "Zero-shot" super-resolution using deep internal learning. In Proceedings of the IEEE conference on computer vision and pattern recognition. 3118--3126.

[25]

Assaf Shocher, Yossi Gandelsman, Inbar Mosseri, Michal Yarom, Michal Irani, William T Freeman, and Tali Dekel. 2020. Semantic Pyramid for Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7457--7466.

[26]

Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. 2016. Instance normalization: The missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022 (2016).

[27]

Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. 2018. Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 9446--9454.

[28]

Aaron Van den Oord, Nal Kalchbrenner, Lasse Espeholt, Oriol Vinyals, Alex Graves, et al. 2016. Conditional image generation with pixelcnn decoders. In Advances in neural information processing systems. 4790--4798.

[29]

Aaron Van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. 2016. Pixel Recurrent Neural Networks. In International Conference on Machine Learning. 1747--1756.

[30]

Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva. 2014. Learning deep features for scene recognition using places database. In Advances in neural information processing systems. 487--495.

[31]

Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision. 2223--2232.

[32]

Maria Zontak, Inbar Mosseri, and Michal Irani. 2013. Separating signal from noise using patch recurrence across scales. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1195--1202.

Cited By

Sushko V, Zhang D, Gall J, Khoreva A. 2023. One-Shot Synthesis of Images and Segmentation Masks. In 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). 6274--6283. https://doi.org/10.1109/WACV56688.2023.00622

Chatillon P, Gousseau Y, Lefebvre S. 2022. A statistically constrained internal method for single image super-resolution. In 2022 26th International Conference on Pattern Recognition (ICPR). 1322--1328. https://doi.org/10.1109/ICPR56361.2022.9956498

Index Terms

Recurrent SinGAN: Towards Scale-Agnostic Single Image GANs
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
  2. Computer graphics
    1. Image manipulation
      1. Image processing

Published In

EITCE '21: Proceedings of the 2021 5th International Conference on Electronic Information Technology and Computer Engineering

October 2021

1723 pages

ISBN:9781450384322

DOI:10.1145/3501409

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

EITCE 2021

EITCE 2021: 2021 5th International Conference on Electronic Information Technology and Computer Engineering

October 22 - 24, 2021

Xiamen, China

Acceptance Rates

EITCE '21 Paper Acceptance Rate 294 of 531 submissions, 55%;

Overall Acceptance Rate 508 of 972 submissions, 52%

Bibliometrics

Total Citations: 2
Total Downloads: 79
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 5

Reflects downloads up to 01 Jan 2025
