Abstract
We introduce a new example-based approach to video stylization that focuses on preserving the visual quality of the style, user controllability, and applicability to arbitrary video. Our method takes as input one or more keyframes that the artist stylizes with standard painting tools, and it automatically propagates the stylization to the rest of the sequence. To facilitate this while preserving visual quality, we developed a new type of guidance for state-of-the-art patch-based synthesis that can be applied to any type of video content and requires no additional information besides the video itself and a user-specified mask of the region to be stylized. We further present a temporal blending approach for interpolating style between keyframes that preserves texture coherence, contrast, and high-frequency details. We evaluate our method on various scenes from real production settings and provide a thorough comparison with prior art.