
Deliberation on object-aware video style transfer network with long–short temporal and depth-consistent constraints

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Video style transfer, as a natural extension of image style transfer, has recently gained much interest. It has important applications in non-photorealistic rendering and computer games. However, existing image-based methods cannot be readily extended to videos because of temporal flickering and stylization inconsistency. Therefore, the main effort of this work is to propose an efficient salient-object-aware and depth-consistent video style transfer algorithm. Specifically, DenseNet is carefully extended as the feed-forward backbone network for better style transfer quality. Then, utilizing salient object segmentation and depth estimation results, a depth-consistent loss and object-masked long–short temporal losses are proposed at the training stage. The proposed losses preserve stereoscopic sense while avoiding salient semantic distortion and flickering between consecutive stylized frames. The proposed network has been compared with several state-of-the-art methods. The experimental results demonstrate that the proposed method is superior in simultaneously achieving real-time processing efficiency, good rendering quality, and coherent stylization. The source code has been released on GitHub.
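As a rough illustration of the two loss families the abstract describes, the following is a minimal sketch, not the authors' released implementation: the function names, the weighting scheme, and the use of NumPy in place of a deep-learning framework are all assumptions. An object-masked temporal loss penalizes differences between the current stylized frame and a flow-warped earlier stylized frame (long- or short-range), weighting salient-object pixels more heavily, while a depth-consistent loss compares depth maps estimated on the stylized output and on the input frame:

```python
import numpy as np

def masked_temporal_loss(stylized_t, warped_stylized_prev,
                         occlusion_mask, object_mask, object_weight=2.0):
    """Mean squared difference between the current stylized frame (H, W, 3)
    and a previous stylized frame warped to time t by optical flow,
    restricted to non-occluded pixels (occlusion_mask in {0, 1}) and
    weighted more heavily inside the salient-object mask."""
    weight = occlusion_mask * (1.0 + (object_weight - 1.0) * object_mask)
    diff = (stylized_t - warped_stylized_prev) ** 2
    denom = weight.sum() * stylized_t.shape[-1] + 1e-8
    return float((weight[..., None] * diff).sum() / denom)

def depth_consistent_loss(depth_stylized, depth_input):
    """Mean squared difference between depth maps (H, W) estimated on the
    stylized output and on the original input frame, encouraging the
    stylization to preserve the scene's stereoscopic structure."""
    return float(np.mean((depth_stylized - depth_input) ** 2))
```

In a "long–short" scheme, `masked_temporal_loss` would be evaluated both against the immediately preceding frame and against a more distant anchor frame, with the per-frame masks supplied by the salient-object segmentation network.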





Author information


Corresponding author

Correspondence to Aiwen Jiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by Natural Science Foundation of China under Grant No. 61966018.


About this article


Cite this article

Liu, Y., Jiang, A., Pan, J. et al. Deliberation on object-aware video style transfer network with long–short temporal and depth-consistent constraints. Neural Comput & Applic 33, 8845–8856 (2021). https://doi.org/10.1007/s00521-020-05630-y

