DOI: 10.1145/3501409.3501476
research-article

Recurrent SinGAN: Towards Scale-Agnostic Single Image GANs

Published: 31 December 2021

Abstract

Learning a deep generative model from a single image has recently attracted considerable attention. In this paper, we present a single-image generative model, named recurrent SinGAN, based on the recently proposed SinGAN [22]. The original SinGAN relies on a pyramid of generators to learn the patch distribution at each level of a multi-scale image pyramid. We replace this pyramid of generators with a single recurrent generator that learns the patch distributions across multiple scales, yielding a scale-agnostic single-image generative model.
On the image synthesis task, our recurrent SinGAN performs on par with the original SinGAN, while reducing the training time by almost 60% and using 4.5x fewer parameters.
Moreover, the results on various image manipulation tasks, including paint-to-image, image editing, harmonization, and image super-resolution, further verify the effectiveness of our proposed method.
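
The difference between a pyramid of generators and one generator reused at every scale can be illustrated with a small, hedged sketch. Nothing below comes from the paper's implementation: `ToyGenerator`, `synthesize`, the layer widths, and the three image sizes are hypothetical stand-ins chosen only to show the coarse-to-fine control flow and how weight sharing changes the parameter count.

```python
import random

def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return in_ch * out_ch * k * k + out_ch

def generator_params(channels=32, layers=5, image_ch=3):
    """Parameter count of a small conv generator (assumed, not the paper's sizes)."""
    widths = [image_ch] + [channels] * (layers - 1) + [image_ch]
    return sum(conv_params(a, b) for a, b in zip(widths, widths[1:]))

def upsample(img, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list of floats."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

class ToyGenerator:
    """Stand-in for a conv generator: adds scaled noise to its input."""
    def __init__(self, gain):
        self.gain = gain   # a single scalar plays the role of the weights
        self.calls = 0     # counts how many scales used this generator

    def __call__(self, prev, noise):
        self.calls += 1
        return [[p + self.gain * n for p, n in zip(prow, nrow)]
                for prow, nrow in zip(prev, noise)]

def synthesize(sizes, generators, rng):
    """Coarse-to-fine sampling; generators[i] refines the image at scale i."""
    h, w = sizes[0]
    img = [[0.0] * w for _ in range(h)]
    for (h, w), gen in zip(sizes, generators):
        img = upsample(img, h, w)
        noise = [[rng.gauss(0, 1) for _ in range(w)] for _ in range(h)]
        img = gen(img, noise)
    return img

sizes = [(4, 6), (8, 12), (16, 24)]   # hypothetical coarse-to-fine scales
rng = random.Random(0)

# Pyramid variant: a separate generator (separate weights) per scale.
pyramid = [ToyGenerator(0.1) for _ in sizes]
pyramid_out = synthesize(sizes, pyramid, rng)

# Recurrent variant: one generator object applied at every scale.
shared = ToyGenerator(0.1)
recurrent_out = synthesize(sizes, [shared] * len(sizes), rng)

per_generator = generator_params()
pyramid_total = per_generator * len(sizes)
recurrent_total = per_generator
```

In this toy setting the parameter ratio equals the number of scales only because every per-scale generator is assumed to be identical in size; it is not meant to reproduce the 4.5x figure reported in the abstract, which depends on the actual architectures.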

References

[1]
Sefi Bell-Kligler, Assaf Shocher, and Michal Irani. 2019. Blind super-resolution kernel estimation using an internal-gan. In Advances in Neural Information Processing Systems. 284--293.
[2]
Taeg Sang Cho, Moshe Butman, Shai Avidan, and William T Freeman. 2008. The patch transform and its applications to image editing. In 2008 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1--8.
[3]
Emily L Denton, Soumith Chintala, Rob Fergus, et al. 2015. Deep generative image models using a laplacian pyramid of adversarial networks. In Advances in neural information processing systems. 1486--1494.
[4]
Laurent Dinh, David Krueger, and Yoshua Bengio. 2014. Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014).
[5]
Daniel Glasner, Shai Bagon, and Michal Irani. 2009. Super-resolution from a single image. In 2009 IEEE 12th international conference on computer vision. IEEE, 349--356.
[6]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672--2680.
[7]
Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. 2015. Draw: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623 (2015).
[8]
Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. 2017. Improved training of wasserstein gans. In Advances in neural information processing systems. 5767--5777.
[9]
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. Gans trained by a two time-scale update rule converge to a local nash equilibrium. arXiv preprint arXiv:1706.08500 (2017).
[10]
Tobias Hinz, Matthew Fisher, Oliver Wang, and Stefan Wermter. 2020. Improved Techniques for Training Single-Image GANs. arXiv preprint arXiv:2003.11512 (2020).
[11]
Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja. 2015. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE conference on computer vision and pattern recognition. 5197--5206.
[12]
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1125--1134.
[13]
Tero Karras, Samuli Laine, and Timo Aila. 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4401--4410.
[14]
Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[15]
Durk P Kingma and Prafulla Dhariwal. 2018. Glow: Generative flow with invertible 1x1 convolutions. In Advances in neural information processing systems. 10215--10224.
[16]
Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013).
[17]
Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. 2017. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4681--4690.
[18]
Yanghao Li, Yuntao Chen, Naiyan Wang, and Zhaoxiang Zhang. 2019. Scale-aware trident networks for object detection. In Proceedings of the IEEE international conference on computer vision. 6054--6063.
[19]
D. Martin, C. Fowlkes, D. Tal, and J. Malik. 2001. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. In Proc. 8th Int'l Conf. Computer Vision, Vol. 2. 416--423.
[20]
Tomer Michaeli and Michal Irani. 2014. Blind deblurring using internal patch recurrence. In European conference on computer vision. Springer, 783--798.
[21]
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved techniques for training gans. arXiv preprint arXiv:1606.03498 (2016).
[22]
Tamar Rott Shaham, Tali Dekel, and Tomer Michaeli. 2019. Singan: Learning a generative model from a single natural image. In Proceedings of the IEEE International Conference on Computer Vision. 4570--4580.
[23]
Assaf Shocher, Shai Bagon, Phillip Isola, and Michal Irani. 2018. InGAN: Capturing and Remapping the "DNA" of a Natural Image. arXiv preprint arXiv:1812.00231 (2018).
[24]
Assaf Shocher, Nadav Cohen, and Michal Irani. 2018. "Zero-shot" super-resolution using deep internal learning. In Proceedings of the IEEE conference on computer vision and pattern recognition. 3118--3126.
[25]
Assaf Shocher, Yossi Gandelsman, Inbar Mosseri, Michal Yarom, Michal Irani, William T Freeman, and Tali Dekel. 2020. Semantic Pyramid for Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7457--7466.
[26]
Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. 2016. Instance normalization: The missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022 (2016).
[27]
Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. 2018. Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 9446--9454.
[28]
Aaron Van den Oord, Nal Kalchbrenner, Lasse Espeholt, Oriol Vinyals, Alex Graves, et al. 2016. Conditional image generation with pixelcnn decoders. In Advances in neural information processing systems. 4790--4798.
[29]
Aaron Van Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. 2016. Pixel Recurrent Neural Networks. In International Conference on Machine Learning. 1747--1756.
[30]
Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva. 2014. Learning deep features for scene recognition using places database. In Advances in neural information processing systems. 487--495.
[31]
Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision. 2223--2232.
[32]
Maria Zontak, Inbar Mosseri, and Michal Irani. 2013. Separating signal from noise using patch recurrence across scales. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1195--1202.

Cited By

  • (2023) One-Shot Synthesis of Images and Segmentation Masks. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 6274-6283. DOI: 10.1109/WACV56688.2023.00622. Online publication date: Jan-2023
  • (2022) A statistically constrained internal method for single image super-resolution. 2022 26th International Conference on Pattern Recognition (ICPR), 1322-1328. DOI: 10.1109/ICPR56361.2022.9956498. Online publication date: 21-Aug-2022

Published In

EITCE '21: Proceedings of the 2021 5th International Conference on Electronic Information Technology and Computer Engineering
October 2021
1723 pages
ISBN: 9781450384322
DOI: 10.1145/3501409

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GANs
  2. Image manipulation
  3. Single image generation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EITCE 2021

Acceptance Rates

EITCE '21 Paper Acceptance Rate 294 of 531 submissions, 55%;
Overall Acceptance Rate 508 of 972 submissions, 52%
