
Advancements in Photorealistic Style Translation with a Hybrid Generative Adversarial Network

Conference paper in Pattern Recognition and Computer Vision (PRCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15034)


Abstract

Photorealistic style translation has gained significant attention in computer vision and graphics due to its potential applications in areas such as content generation, artistic expression, and image editing. In this paper, we propose an improved hybrid Generative Adversarial Network (GAN) framework for photorealistic style translation. The model aims to overcome limitations observed in the existing literature by leveraging the strengths of both GANs and autoencoders. A fast and efficient feature extractor based on EfficientNet-B2 is developed to improve the performance of the style transfer method. Moreover, two loss functions are proposed that further improve the model's accuracy and realism. We conduct extensive experiments on state-of-the-art datasets and demonstrate the effectiveness of the proposed model. Additionally, we analyze the results and discuss potential avenues for future research in photorealistic style translation.
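
To make the hybrid design concrete, the sketch below shows how an EfficientNet-B2 backbone can serve as a frozen perceptual feature extractor and how content, style, and autoencoder reconstruction terms might be combined. This is a minimal PyTorch illustration, not the authors' implementation: the tapped layers, the Gram-matrix style loss, and the names EfficientNetFeatures and hybrid_losses are assumptions for exposition; the paper's two proposed losses are not reproduced here.

import torch.nn as nn
from torchvision.models import efficientnet_b2, EfficientNet_B2_Weights

class EfficientNetFeatures(nn.Module):
    """Frozen EfficientNet-B2 backbone used as a perceptual feature extractor.

    Which stages to tap is an assumption; inputs are expected to be
    ImageNet-normalized, as required by the pretrained weights.
    """
    def __init__(self):
        super().__init__()
        backbone = efficientnet_b2(weights=EfficientNet_B2_Weights.DEFAULT)
        self.features = backbone.features[:4]  # early convolutional stages
        for p in self.parameters():
            p.requires_grad = False  # extractor stays fixed during training

    def forward(self, x):
        return self.features(x)

def gram_matrix(f):
    # Channel-correlation statistics commonly used in style losses.
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def hybrid_losses(extractor, content, style, stylized, recon):
    """Illustrative content, style, and reconstruction terms (stand-ins,
    not the paper's exact loss functions)."""
    f_content = extractor(content)
    f_style = extractor(style)
    f_stylized = extractor(stylized)
    content_loss = nn.functional.mse_loss(f_stylized, f_content)
    style_loss = nn.functional.mse_loss(gram_matrix(f_stylized),
                                        gram_matrix(f_style))
    recon_loss = nn.functional.l1_loss(recon, content)  # autoencoder branch
    return content_loss, style_loss, recon_loss

In training, such terms would be weighted and summed with the GAN's adversarial loss from the discriminator; the weights are left open here as hyperparameters.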


Notes

  1. https://www.kaggle.com/datasets/arnaud58/landscape-pictures (a download sketch follows these notes).

  2. https://github.com/colemiller94/gatedGAN/tree/master/photo2fourcollection.
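
The landscape dataset in note 1 is hosted on Kaggle; the following minimal sketch fetches it with the official kaggle Python package, assuming an API token is configured at ~/.kaggle/kaggle.json. The output directory is an arbitrary choice.

from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads the token from ~/.kaggle/kaggle.json
api.dataset_download_files(
    "arnaud58/landscape-pictures",  # slug taken from the URL in note 1
    path="data/landscape-pictures",
    unzip=True,
)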


Acknowledgement

This research is supported by the Natural Science Foundation of China (No. 61972183).

Author information

Corresponding author

Correspondence to Rabia Tahir.


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Cheng, K., Tahir, R., Wan, H. (2025). Advancements in Photorealistic Style Translation with a Hybrid Generative Adversarial Network. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15034. Springer, Singapore. https://doi.org/10.1007/978-981-97-8505-6_24

  • DOI: https://doi.org/10.1007/978-981-97-8505-6_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-8504-9

  • Online ISBN: 978-981-97-8505-6

  • eBook Packages: Computer Science, Computer Science (R0)
