Abstract
Image generation has long been an important research direction in computer vision, with rich applications in virtual reality, image design, and video synthesis. In this paper, we focus on arbitrary style transfer based on semantic images at high resolution (\(512\times 1024\)). We propose a new multi-channel generative adversarial network that uses fewer parameters to generate multi-style images. The framework consists of a content feature extraction network, a style feature extraction network, and a content-style feature fusion network. Our qualitative experiments show that the proposed multi-style image generation network can efficiently generate semantic-based, high-quality images with multiple artistic styles, greater clarity, and richer details. A user preference study shows that the results generated by our method are preferred by participants, and a speed study shows that our method generates results faster than current state-of-the-art methods. We publicly release the source code of our project at https://github.com/JuanMaoHSQ/Multi-style-image-generation-based-on-semantic-image.
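The abstract describes fusing content features with style features. One common primitive for such content-style fusion is adaptive instance normalization (AdaIN), which re-scales the per-channel statistics of the content features to match those of the style features; whether the paper's fusion network uses exactly this operation is an assumption here, so the following numpy sketch is purely illustrative.

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """AdaIN-style fusion: align the per-channel mean and standard
    deviation of content features to those of style features.

    content_feat, style_feat: arrays of shape (C, H, W).
    """
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    # Whiten the content features, then re-color with style statistics.
    normalized = (content_feat - c_mean) / (c_std + eps)
    return normalized * s_std + s_mean

rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(4, 8, 8))   # e.g. semantic-image features
style = rng.normal(3.0, 2.0, size=(4, 8, 8))     # e.g. artwork features
fused = adain(content, style)
# Per-channel means of the fused features now match the style features.
print(np.allclose(fused.mean(axis=(1, 2)), style.mean(axis=(1, 2)), atol=1e-3))
```

The fused tensor keeps the spatial layout of the content features (hence the semantic structure) while adopting the style's channel statistics, which is why this operation is a popular building block in arbitrary style transfer.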
Data Availability
The datasets generated and analyzed during the current study are available in the Cityscapes repository, https://www.cityscapes-dataset.com/, and the WikiArt repository, https://www.wikiart.org/.
Funding
This work is supported by the National Natural Science Foundation of China (61807002).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yu, Y., Li, D., Li, B. et al. Multi-style image generation based on semantic image. Vis Comput 40, 3411–3426 (2024). https://doi.org/10.1007/s00371-023-03042-2