Abstract
In this paper, we present iHairRecolorer, the first deep-learning-based approach to example-based hair color transfer in videos. Given an input video and a reference image, our method automatically transfers the hair color of the reference image to the hair in the video while keeping the other hair attributes (e.g., shape, structure, and illumination) untouched, producing vivid, color-transferred dynamic hair in the video. Our method performs the color transfer purely in image space, without any form of intermediate 3D hair reconstruction. The key enabler of our method is a carefully designed conditional generative model that explicitly disentangles the various hair attributes into corresponding sub-spaces, implemented as conditional modules integrated into a generator. We introduce a novel spatially and temporally normalized luminance map to represent the structure and illumination of the hair. Such a representation largely eases the burden on the generator of synthesizing temporally coherent, vivid dynamic hair in the video. We further introduce a cycle consistency loss to enforce the faithfulness of the generated results with respect to the reference. We demonstrate our system's superiority in video hair color transfer through extensive experiments and comparisons with alternative methods.
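The spatially and temporally normalized luminance map is only named here; its exact formulation is given in the full paper. The following is a minimal NumPy sketch of one plausible reading, assuming per-frame luminance normalization within a hair mask followed by a clip-level rescaling; the function name, the mask input, the Rec. 601 luma weights, and the two-stage normalization scheme are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def normalized_luminance_maps(frames, hair_masks, eps=1e-6):
    """Hypothetical sketch of a spatially and temporally normalized luminance map.

    frames:     (T, H, W, 3) float32 RGB video frames in [0, 1]
    hair_masks: (T, H, W)    float32 binary hair masks
    Returns a (T, H, W) array of luminance maps normalized
    (a) spatially, within each frame's hair region, and
    (b) temporally, against clip-level statistics, so the generator sees a
    color-free representation of hair structure and illumination.
    """
    # Rec. 601 luma as a simple luminance proxy
    luma = 0.299 * frames[..., 0] + 0.587 * frames[..., 1] + 0.114 * frames[..., 2]
    luma = luma * hair_masks  # keep only hair pixels

    out = np.zeros_like(luma)
    per_frame_mean = np.zeros(len(frames))

    # Spatial normalization: zero-mean / unit-std inside each frame's hair mask
    for t in range(len(frames)):
        m = hair_masks[t] > 0.5
        if m.sum() == 0:
            continue
        mu, sigma = luma[t][m].mean(), luma[t][m].std()
        per_frame_mean[t] = mu
        out[t][m] = (luma[t][m] - mu) / (sigma + eps)

    # Temporal normalization: re-apply a single clip-level affine so all frames
    # share the same luminance scale, suppressing frame-to-frame flicker
    clip_mu, clip_sigma = per_frame_mean.mean(), per_frame_mean.std()
    out = (out * (clip_sigma + eps) + clip_mu) * hair_masks
    return out
```

A cycle consistency term of the kind mentioned above would then, for example, penalize the discrepancy obtained when the transferred color is mapped back onto the reference; again, the concrete loss used by iHairRecolorer is specified in the full text.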
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China (Grant No. 2018YFE0100900) and the National Natural Science Foundation of China (Grant No. 62172363).
Supporting information
The supporting information is available online at info.scichina.com and link.springer.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.
Electronic Supplementary Material
Supplementary material, approximately 4.43 MB.
Supplementary material, approximately 8.93 MB.
Supplementary material, approximately 13.5 MB.
Supplementary material, approximately 11.2 MB.
Supplementary material, approximately 28.0 MB.
Supplementary material, approximately 6.03 MB.
Supplementary material, approximately 6.29 MB.
Supplementary material, approximately 132 KB.
Supplementary material, approximately 19.6 MB.
Cite this article
Wu, K., Yang, L., Fu, H. et al. iHairRecolorer: deep image-to-video hair color transfer. Sci. China Inf. Sci. 64, 210104 (2021). https://doi.org/10.1007/s11432-021-3325-6