Heterogeneous Avatar Synthesis Based on Disentanglement of Topology and Rendering

  • Conference paper
  • First Online:
Computer Vision – ACCV 2022 (ACCV 2022)

Abstract

Different heterogeneous domains exhibit obvious structural and color discrepancies. In this paper, we explore the challenging heterogeneous avatar synthesis (HAS) task, which considers both topology and rendering transfer. HAS transfers the topology as well as the rendering style of the referenced face to the source face to produce high-fidelity heterogeneous avatars. Specifically, we first utilize a Rendering Transfer Network (RT-Net) to render the grayscale source face according to the color palette of the referenced face; the grayscale features and color style are injected into RT-Net via adaptive feature modulation. Second, we apply a Topology Transfer Network (TT-Net) to perform heterogeneous facial topology transfer, in which the image content produced by RT-Net is transformed by AdaIN controlled by a heterogeneous identity embedding. Comprehensive experiments show that disentangling rendering and topology benefits the HAS task, and that our HASNet achieves performance comparable to other state-of-the-art methods.
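
Both RT-Net and TT-Net rely on adaptive feature modulation in the AdaIN family (Huang and Belongie, ICCV 2017). Below is a minimal PyTorch sketch of such a modulation layer, conditioned on an identity embedding as in TT-Net; the layer sizes, the affine head, and all names are illustrative assumptions rather than the authors' exact architecture.

import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Conditioned adaptive instance normalization (illustrative sketch)."""

    def __init__(self, num_channels: int, embed_dim: int):
        super().__init__()
        # Predict a per-channel scale and shift from the conditioning embedding.
        self.affine = nn.Linear(embed_dim, 2 * num_channels)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) content features; cond: (N, embed_dim) embedding.
        gamma, beta = self.affine(cond).chunk(2, dim=1)
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        # Instance-normalize the content features, then re-style them.
        mean = x.mean(dim=(2, 3), keepdim=True)
        std = x.std(dim=(2, 3), keepdim=True) + 1e-5
        return gamma * (x - mean) / std + beta

# Usage: modulate 256-channel features with a hypothetical 512-d identity embedding.
feats = torch.randn(1, 256, 32, 32)
id_embed = torch.randn(1, 512)
out = AdaIN(256, 512)(feats, id_embed)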

Notes

  1. http://lokeshdhakar.com/projects/color-thief/.
  2. https://github.com/postite/metfaces-dataset.

Acknowledgments

This work is supported by the National Key R&D Program of China (2019YFB1406202).

Author information

Corresponding author

Correspondence to GuiXuan Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gao, N., Zeng, Z., Zhang, G., Zhang, S. (2023). Heterogeneous Avatar Synthesis Based on Disentanglement of Topology and Rendering. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13844. Springer, Cham. https://doi.org/10.1007/978-3-031-26316-3_9

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-26316-3_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26315-6

  • Online ISBN: 978-3-031-26316-3

  • eBook Packages: Computer Science, Computer Science (R0)
