I2I translation model based on CondConv and spectral domain realness measurement: BCS-StarGAN

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

In recent years, research on Image-to-Image (I2I) translation based on Generative Adversarial Networks has received extensive attention from both industry and academia, and relevant results continue to emerge. As a typical representative, StarGAN v2 has achieved good results in I2I translation, but in some cases it still suffers from insufficient feature extraction, which degrades the quality of the translated images. The conventional remedy is to increase the depth and width of the model, but this raises model complexity and makes the already difficult-to-train StarGAN v2 even harder to train, hindering its application. To this end, this paper proposes BCS-StarGAN, an improved model based on conditionally parameterized convolution (CondConv) and spectral-domain realness measurement, which significantly improves I2I translation quality while adding only a small amount of computation. We first replace the conventional convolutions in the Bottleneck module of the StarGAN v2 generator with CondConv. Furthermore, to better capture the high-frequency distribution of real images, a lightweight spectral classifier is added to the discriminator; it enables the discriminator to judge whether an image contains realistic high-frequency content, motivating the generator to learn the high-frequency information of the image. Finally, we conduct qualitative and quantitative comparisons on three public datasets; the results against mainstream models show that BCS-StarGAN achieves the best performance.
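
The abstract's two ingredients can be illustrated with short sketches. First, a minimal PyTorch sketch of a conditionally parameterized convolution in the spirit of CondConv (reference 9): a routing function computes per-example weights from globally pooled features and mixes a small bank of expert kernels into one kernel per example. The expert count, sigmoid routing, and initialization below are illustrative assumptions, not details taken from the BCS-StarGAN implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CondConv2d(nn.Module):
    """Conditionally parameterized 2-D convolution (CondConv-style sketch)."""

    def __init__(self, in_ch, out_ch, kernel_size=3, num_experts=4, padding=1):
        super().__init__()
        self.out_ch, self.padding = out_ch, padding
        # One weight tensor per expert kernel (illustrative initialization).
        self.experts = nn.Parameter(
            0.02 * torch.randn(num_experts, out_ch, in_ch, kernel_size, kernel_size))
        # Routing: global average pooling -> linear -> sigmoid, one weight per expert.
        self.route = nn.Linear(in_ch, num_experts)

    def forward(self, x):
        b, c, h, w = x.shape
        r = torch.sigmoid(self.route(x.mean(dim=(2, 3))))           # (b, experts)
        # Mix the expert kernels into one kernel per example.
        kernels = torch.einsum('be,eoihw->boihw', r, self.experts)  # (b, out, in, k, k)
        # A grouped convolution applies each example's own mixed kernel.
        out = F.conv2d(x.reshape(1, b * c, h, w),
                       kernels.reshape(b * self.out_ch, c, *kernels.shape[-2:]),
                       padding=self.padding, groups=b)
        return out.reshape(b, self.out_ch, *out.shape[-2:])

Because the kernels are mixed before a single convolution is applied, the per-example compute stays close to that of one ordinary convolution while capacity grows with the number of experts, which matches the abstract's claim of adding only a small amount of computation.

Second, a lightweight spectral classifier in the spirit of SSD-GAN (reference 11) can be sketched as follows: reduce an image to a 1-D azimuthally averaged power spectrum and classify only its high-frequency bins, so the discriminator can also judge spectral realness. The bin count, grayscale reduction, and head architecture are assumptions for illustration, not the paper's exact design.

import torch
import torch.nn as nn

def azimuthal_spectrum(img, n_bins=64):
    """1-D azimuthally averaged log power spectrum of a grayscale batch (b, h, w)."""
    f = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    power = f.abs() ** 2
    b, h, w = img.shape
    yy, xx = torch.meshgrid(torch.arange(h, device=img.device) - h // 2,
                            torch.arange(w, device=img.device) - w // 2,
                            indexing='ij')
    radius = torch.sqrt(yy.float() ** 2 + xx.float() ** 2)
    bins = (radius / radius.max() * (n_bins - 1)).long().flatten()  # (h*w,)
    # Sum the power falling into each radial bin, then average per bin.
    spec = torch.zeros(b, n_bins, device=img.device)
    spec.scatter_add_(1, bins.unsqueeze(0).repeat(b, 1), power.flatten(1))
    counts = torch.bincount(bins, minlength=n_bins).clamp(min=1)
    return torch.log(spec / counts + 1e-8)

class SpectralHead(nn.Module):
    """Tiny classifier over the high-frequency half of the spectrum."""

    def __init__(self, n_bins=64):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(n_bins // 2, 32), nn.ReLU(),
                                nn.Linear(32, 1))

    def forward(self, img):                          # img: (b, 3, h, w)
        spec = azimuthal_spectrum(img.mean(dim=1))   # crude grayscale reduction
        return self.fc(spec[:, spec.shape[1] // 2:])  # high-frequency bins only

Such a head's score can be combined with the conventional spatial discriminator output, so the adversarial signal also penalizes the missing high-frequency content that up-convolutions tend to wash out (references 10, 11).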

Data availability

Data will be made available on request.

References

  1. Liu, Y.: Improved generative adversarial network and its application in image oil painting style transfer. Image Vis. Comput. 105, 104087 (2021)

  2. Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z.: Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4681–4690 (2017)

  3. Shao, Y., Li, L., Ren, W., Gao, C., Sang, N.: Domain adaptation for image dehazing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2808–2817 (2020)

  4. Chen, J., Chen, J., Chao, H., Yang, M.: Image blind denoising with generative adversarial network based noise modeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3155–3164 (2018)

  5. Liu, H., Wan, Z., Huang, W., Song, Y., Han, X., Liao, J.: PD-GAN: probabilistic diverse GAN for image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9371–9381 (2021)

  6. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, vol. 27. Curran (2014)

  7. Choi, Y., Choi, M., Kim, M., Ha, J.-W., Kim, S., Choo, J.: StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8789–8797 (2018)

  8. Choi, Y., Uh, Y., Yoo, J., Ha, J.-W.: StarGAN v2: diverse image synthesis for multiple domains. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8188–8197 (2020)

  9. Yang, B., Bender, G., Le, Q.V., Ngiam, J.: CondConv: conditionally parameterized convolutions for efficient inference. In: Advances in Neural Information Processing Systems, vol. 32. Curran (2019)

  10. Durall, R., Keuper, M., Keuper, J.: Watch your up-convolution: CNN-based generative deep neural networks are failing to reproduce spectral distributions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7890–7899 (2020)

  11. Chen, Y., Li, G., Jin, C., Liu, S., Li, T.: SSD-GAN: measuring the realness in the spatial and spectral domains. Proc. AAAI Conf. Artif. Intell. 35, 1105–1112 (2021)

  12. Isola, P., Zhu, J.-Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134 (2017)

  13. Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8798–8807 (2018)

  14. Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232 (2017)

  15. Yi, Z., Zhang, H., Tan, P., Gong, M.: DualGAN: unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2849–2857 (2017)

  16. Kim, T., Cha, M., Kim, H., Lee, J.K., Kim, J.: Learning to discover cross-domain relations with generative adversarial networks. In: International Conference on Machine Learning, pp. 1857–1865. PMLR (2017)

  17. Liu, M.-Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: Advances in Neural Information Processing Systems, vol. 30. Curran (2017)

  18. Kim, D., Khan, M.A., Choo, J.: Not just compete, but collaborate: local image-to-image translation via cooperative mask prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6509–6518 (2021)

  19. Xu, Y., Xie, S., Wu, W., Zhang, K., Gong, M., Batmanghelich, K.: Maximum spatial perturbation consistency for unpaired image-to-image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 18311–18320 (2022)

  20. Theiss, J., Leverett, J., Kim, D., Prakash, A.: Unpaired image translation via vector symbolic architectures. In: Computer Vision-ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXI, pp. 17–32. Springer (2022)

  21. Park, J., Kim, S., Kim, S., Yoo, J., Uh, Y., Kim, S.: LANIT: language-driven image-to-image translation for unlabeled data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023)

  22. Zhu, J.-Y., Zhang, R., Pathak, D., Darrell, T., Efros, A.A., Wang, O., Shechtman, E.: Toward multimodal image-to-image translation. In: Advances in Neural Information Processing Systems, vol. 30. Curran (2017)

  23. Chen, Y., Dai, X., Liu, M., Chen, D., Yuan, L., Liu, Z.: Dynamic convolution: attention over convolution kernels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11030–11039 (2020)

  24. Zhou, J., Jampani, V., Pi, Z., Liu, Q., Yang, M.-H.: Decoupled dynamic filter networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6647–6656 (2021)

  25. Li, Y., Chen, Y.: Revisiting dynamic convolution via matrix decomposition. In: International Conference on Learning Representations (2021)

  26. Gong, X., Chang, S., Jiang, Y., Wang, Z.: AutoGAN: neural architecture search for generative adversarial networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3224–3234 (2019)

  27. Tancik, M., Srinivasan, P., Mildenhall, B., Fridovich-Keil, S., Raghavan, N., Singhal, U., Ramamoorthi, R., Barron, J., Ng, R.: Fourier features let networks learn high frequency functions in low dimensional domains. Adv. Neural Inf. Process. Syst. 33, 7537–7547 (2020)

  28. Rahaman, N., Baratin, A., Arpit, D., Draxler, F., Lin, M., Hamprecht, F., Bengio, Y., Courville, A.: On the spectral bias of neural networks. In: International Conference on Machine Learning, pp. 5301–5310. PMLR (2019)

  29. Liu, P., Zhang, H., Zhang, K., Lin, L., Zuo, W.: Multi-level wavelet-CNN for image restoration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 773–782 (2018)

  30. Cai, M., Zhang, H., Huang, H., Geng, Q., Li, Y., Huang, G.: Frequency domain image translation: more photo-realistic, better identity-preserving. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 13930–13940 (2021)

  31. Huang, X., Liu, M.-Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-to-image translation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 172–189 (2018)

  32. Lee, H.-Y., Tseng, H.-Y., Huang, J.-B., Singh, M., Yang, M.-H.: Diverse image-to-image translation via disentangled representations. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 35–51 (2018)

  33. Mao, Q., Lee, H.-Y., Tseng, H.-Y., Ma, S., Yang, M.-H.: Mode seeking generative adversarial networks for diverse image synthesis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1429–1437 (2019)

  34. Zhao, Y., Chen, C.: Unpaired image-to-image translation via latent energy transport. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16418–16427 (2021)

  35. Jung, C., Kwon, G., Ye, J.C.: Exploring patch-wise semantic relation for contrastive learning in image-to-image translation tasks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 18260–18269 (2022)

  36. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. In: International Conference on Learning Representations (2018)

  37. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6629–6640 (2017)

  38. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018)

Author information

Corresponding author

Correspondence to Chun Liu.

Additional information

Communicated by J. Gao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Shangguan, X., Liu, C. et al. I2I translation model based on CondConv and spectral domain realness measurement: BCS-StarGAN. Multimedia Systems 29, 2511–2526 (2023). https://doi.org/10.1007/s00530-023-01117-7
