
Adaptive Style Modulation for Artistic Style Transfer

Published in: Neural Processing Letters

Abstract

Arbitrary-style-per-model (ASPM) style transfer algorithms transfer arbitrary styles with a single model. Statistics-based ASPM learning algorithms, represented by adaptive instance normalization (AdaIN), conduct instance normalization and then apply an affine transformation to the target features. These algorithms are computationally efficient and easy to embed in convolutional neural networks; consequently, they are widely used in image synthesis tasks to control the style of the resulting images. However, the style of a stylized image may be a blend of the styles of the content and style images, which suggests that these methods do not transfer style accurately. In this work, we rethink the function of AdaIN in controlling style. We show that the role of AdaIN is to (1) give each input content image a specific optimization target, (2) dynamically set cross-channel correlations, and (3) act as a feature selector when combined with an activation function. Accordingly, we propose adaptive style modulation (AdaSM), which fully leverages the three roles mentioned above and thereby enables more precise control of global style. Experimental results show that AdaSM provides superior style controllability, alleviates the style blending problem, and outperforms state-of-the-art methods in artistic style transfer tasks.
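The AdaIN operation described above, instance normalization of the content features followed by an affine transformation whose scale and shift are taken from the style features, can be sketched as follows. This is a minimal NumPy illustration of standard AdaIN, not the AdaSM method proposed in this article; the function name, shapes, and epsilon value are our assumptions:

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization on feature maps of shape (C, H, W).

    Each content channel is normalized to zero mean and unit variance
    (instance normalization), then rescaled and shifted using the
    per-channel mean and standard deviation of the style features.
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / c_std + s_mean
```

After this transformation, each output channel carries the first- and second-order statistics of the corresponding style channel, which is how statistics-based ASPM methods inject style into the content representation.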




Data Availability

The datasets used during this study are available upon request to the authors.


Acknowledgements

The research was supported by the Key Laboratory of Spectral Imaging Technology, Xi’an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, the Key Laboratory of Biomedical Spectroscopy of Xi’an, the Outstanding Award for Talent Project of the Chinese Academy of Sciences, “From 0 to 1” Original Innovation Project of the Basic Frontier Scientific Research Program of the Chinese Academy of Sciences, and Autonomous Deployment Project of Xi’an Institute of Optics and Precision Mechanics of Chinese Academy of Sciences.

Author information


Corresponding author

Correspondence to Quan Wang.

Ethics declarations

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, Y., Hu, B., Huang, Y. et al. Adaptive Style Modulation for Artistic Style Transfer. Neural Process Lett 55, 6213–6230 (2023). https://doi.org/10.1007/s11063-022-11135-7
