research-article

SWAGAN: a style-based wavelet-driven generative model

Published: 19 July 2021

Abstract

In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degradation in quality for high-frequency content, stemming from a spectrally biased architecture and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step of the way. This approach, designed to directly tackle the spectral bias of neural networks, yields an improvement in the ability to generate medium- and high-frequency content, including structures which other networks fail to learn. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework and verifying that content generation in the wavelet domain leads to more realistic high-frequency content, even when trained for fewer iterations. Furthermore, we verify that our model's latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved high-frequency performance in downstream tasks.
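The abstract describes generating content in the wavelet domain rather than the pixel domain. As an illustration of the kind of sub-band decomposition involved, here is a minimal single-level 2D Haar wavelet transform and its inverse; this is a generic sketch of the transform, not the paper's actual implementation, which operates on learned features inside the generator and discriminator.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar wavelet transform.

    Splits an array of shape (H, W) with even dimensions into four
    half-resolution sub-bands: LL (coarse approximation) plus LH, HL,
    and HH (horizontal, vertical, and diagonal detail). The detail
    bands carry the medium/high-frequency content that spectrally
    biased pixel-space generators tend to under-represent.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0
    hl = (a - b + c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse transform: exact reconstruction from the four sub-bands."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    img[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    img[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return img
```

Because the transform is invertible, a generator can synthesize the four sub-bands directly and recover an image losslessly via the inverse transform, which is what makes frequency-domain generation a drop-in replacement for pixel-space output.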


Supplemental Material

Presentation video: 3450626.3459836.mp4 (mp4, 164.5 MB)



• Published in

  ACM Transactions on Graphics, Volume 40, Issue 4
  August 2021, 2170 pages
  ISSN: 0730-0301
  EISSN: 1557-7368
  DOI: 10.1145/3450626

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

