Adaptive Style Modulation for Artistic Style Transfer

Zhang, Yipeng; Hu, Bingliang; Huang, Yingying; Gao, Chi; Wang, Quan

doi:10.1007/s11063-022-11135-7

Published: 31 December 2022

Volume 55, pages 6213–6230 (2023)

Neural Processing Letters

Yipeng Zhang^1,2,3,
Bingliang Hu^1,2,
Yingying Huang^1,2,3,
Chi Gao^1,2,3 &
Quan Wang^1,2

Abstract

Arbitrary-style-per-model (ASPM) style transfer algorithms transfer arbitrary styles using a single model. Statistics-based ASPM algorithms, represented by adaptive instance normalization (AdaIN), apply instance normalization and then an affine transformation to the target features. These algorithms are computationally efficient and easy to embed in convolutional neural networks. Consequently, they are widely used in image synthesis tasks to control the style of the resulting images. However, the style of a stylized image may be a blend of the styles of the content image and the style image, which suggests that these methods do not transfer styles accurately. In this work, we rethink the function of AdaIN in controlling style. We show that the role of AdaIN is to (1) give each input content image a specific optimization target, (2) dynamically set cross-channel correlations, and (3) act as a feature selector when combined with an activation function. Accordingly, we propose adaptive style modulation (AdaSM), which fully leverages these three roles and thereby enables more precise control of global style. Experimental results show that AdaSM provides superior style controllability, alleviates the style blending problem, and outperforms state-of-the-art methods in artistic style transfer tasks.
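To make the AdaIN operation mentioned in the abstract concrete, the following is a minimal NumPy sketch of the standard formulation: instance-normalize the content features per channel, then rescale and shift them with the per-channel statistics of the style features. It illustrates AdaIN only, not the AdaSM method proposed in the article, whose details are not given here. The function names, the (C, H, W) feature layout, the `eps` value, and the random stand-in features are assumptions made for illustration.

```python
import numpy as np

def channel_stats(feat, eps=1e-5):
    """Per-channel mean and standard deviation over the spatial axes of a (C, H, W) array."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(feat.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std

def adain(content, style, eps=1e-5):
    """Standard AdaIN: align per-channel statistics of content features to those of style features."""
    c_mean, c_std = channel_stats(content, eps)
    s_mean, s_std = channel_stats(style, eps)
    normalized = (content - c_mean) / c_std   # instance normalization
    return s_std * normalized + s_mean        # affine transform driven by style statistics

# Random arrays stand in for encoder feature maps (hypothetical shapes).
rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(4, 8, 8))
style = rng.normal(3.0, 2.0, size=(4, 16, 16))
out = adain(content, style)
```

After this transform, each channel of `out` has approximately the mean and standard deviation of the corresponding style channel while keeping the spatial layout of the content features. Note that the statistics are computed independently per channel, so this sketch does not model cross-channel correlations.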

Data Availability

The datasets used during this study are available upon request to the authors.

Acknowledgements

The research was supported by the Key Laboratory of Spectral Imaging Technology, Xi’an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences, the Key Laboratory of Biomedical Spectroscopy of Xi’an, the Outstanding Award for Talent Project of the Chinese Academy of Sciences, the “From 0 to 1” Original Innovation Project of the Basic Frontier Scientific Research Program of the Chinese Academy of Sciences, and the Autonomous Deployment Project of Xi’an Institute of Optics and Precision Mechanics of the Chinese Academy of Sciences.

Author information

Authors and Affiliations

Key Laboratory of Spectral Imaging Technology CAS, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an, 710119, Shaanxi, China
Yipeng Zhang, Bingliang Hu, Yingying Huang, Chi Gao & Quan Wang
The Key Laboratory of Biomedical Spectroscopy of Xi’an, Xi’an, 710119, Shaanxi, China
Yipeng Zhang, Bingliang Hu, Yingying Huang, Chi Gao & Quan Wang
School of Optoelectronics, University of Chinese Academy of Sciences, Beijing, 100190, China
Yipeng Zhang, Yingying Huang & Chi Gao

Corresponding author

Correspondence to Quan Wang.

Ethics declarations

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Y., Hu, B., Huang, Y. et al. Adaptive Style Modulation for Artistic Style Transfer. Neural Process Lett 55, 6213–6230 (2023). https://doi.org/10.1007/s11063-022-11135-7

Accepted: 18 December 2022
Published: 31 December 2022
Issue Date: October 2023
DOI: https://doi.org/10.1007/s11063-022-11135-7
