
Multi-style image generation based on semantic image

Original article, published in The Visual Computer

Abstract

Image generation has long been an important research direction in computer vision, with rich applications in virtual reality, image design, and video synthesis. In this paper, we focus on arbitrary style transfer based on high-resolution (\(512\times 1024\)) semantic images. We propose a new multi-channel generative adversarial network that uses fewer parameters to generate multi-style images. The framework consists of a content feature extraction network, a style feature extraction network, and a content-style feature fusion network. Our qualitative experiments show that the proposed multi-style image generation network efficiently generates semantic-based, high-quality images in multiple artistic styles, with greater clarity and richer details. A user preference study shows that the images generated by our method are preferred over those of competing methods, and a speed study shows that our method generates results faster than current state-of-the-art methods. We publicly release the source code of our project at https://github.com/JuanMaoHSQ/Multi-style-image-generation-based-on-semantic-image.
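The abstract describes a generator built from three subnetworks: a content feature extractor for the semantic map, a style feature extractor, and a fusion network that combines the two. As a rough illustration of that decomposition, the sketch below wires up three such modules in PyTorch. The layer sizes, channel counts, and the AdaIN-style modulation used for fusion are all assumptions for illustration; they are not the authors' actual architecture, which is defined in the released source code.

```python
# Hypothetical sketch of a three-part generator: content encoder,
# style encoder, and a fusion/decoder stage. All hyperparameters and
# the AdaIN-like fusion are illustrative assumptions.
import torch
import torch.nn as nn


class ContentEncoder(nn.Module):
    """Extracts spatial features from the semantic input image."""
    def __init__(self, in_ch=3, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)


class StyleEncoder(nn.Module):
    """Maps a style image to a global style vector via pooling."""
    def __init__(self, in_ch=3, ch=32, style_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(ch * 2, style_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))


class FusionDecoder(nn.Module):
    """Fuses content features with the style vector and decodes an image."""
    def __init__(self, feat_ch=64, style_dim=64, out_ch=3):
        super().__init__()
        # One affine layer predicts a per-channel scale and shift from style.
        self.affine = nn.Linear(style_dim, feat_ch * 2)
        self.decode = nn.Sequential(
            nn.Upsample(scale_factor=2),
            nn.Conv2d(feat_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(32, out_ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, content_feat, style_vec):
        scale, shift = self.affine(style_vec).chunk(2, dim=1)
        # Normalize content statistics, then modulate them with the style's.
        mean = content_feat.mean(dim=(2, 3), keepdim=True)
        std = content_feat.std(dim=(2, 3), keepdim=True) + 1e-5
        normed = (content_feat - mean) / std
        out = normed * scale[:, :, None, None] + shift[:, :, None, None]
        return self.decode(out)


# Stand-in tensors for a semantic map and a style image (real inputs
# in the paper are 512x1024; a smaller size is used here for speed).
semantic = torch.randn(1, 3, 64, 128)
style = torch.randn(1, 3, 64, 128)
img = FusionDecoder()(ContentEncoder()(semantic), StyleEncoder()(style))
print(img.shape)  # torch.Size([1, 3, 64, 128])
```

Separating the style pathway into a single global vector is what makes the generator multi-style: swapping the style input changes only the modulation statistics, so one set of content weights serves arbitrary styles.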


Data Availability

The datasets generated and analyzed during the current study are available in the Cityscapes repository (https://www.cityscapes-dataset.com/) and the WikiArt repository (https://www.wikiart.org/).


Funding

This work is supported by the National Natural Science Foundation of China (Grant No. 61807002).

Author information

Corresponding author

Correspondence to Yue Yu.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, Y., Li, D., Li, B. et al. Multi-style image generation based on semantic image. Vis Comput 40, 3411–3426 (2024). https://doi.org/10.1007/s00371-023-03042-2

