Heterogeneous Avatar Synthesis Based on Disentanglement of Topology and Rendering

  • Conference paper
  • First Online:
Computer Vision – ACCV 2022 (ACCV 2022)

Abstract

Different heterogeneous domains exhibit obvious structural and color discrepancies. In this paper, we explore the challenging heterogeneous avatar synthesis (HAS) task, which considers both topology and rendering transfer. HAS transfers the topology as well as the rendering style of the referenced face to the source face to produce high-fidelity heterogeneous avatars. Specifically, we first utilize a Rendering Transfer Network (RT-Net) to render the grayscale source face according to the color palette of the referenced face; the grayscale features and color style are injected into RT-Net via adaptive feature modulation. Second, we apply a Topology Transfer Network (TT-Net) to perform heterogeneous facial topology transfer, in which the image content produced by RT-Net is transformed by AdaIN controlled by a heterogeneous identity embedding. Comprehensive experiments show that disentangling rendering and topology benefits the HAS task, and that our HASNet achieves performance comparable to other state-of-the-art methods.
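
Both RT-Net and TT-Net rely on adaptive feature modulation in the AdaIN family (Huang and Belongie, ICCV 2017). Below is a minimal PyTorch sketch of such a modulation layer, conditioned on an identity embedding as in TT-Net; the layer sizes, the affine head, and all names are illustrative assumptions rather than the authors' exact architecture.

import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Conditioned adaptive instance normalization (illustrative sketch)."""

    def __init__(self, num_channels: int, embed_dim: int):
        super().__init__()
        # Predict a per-channel scale and shift from the conditioning embedding.
        self.affine = nn.Linear(embed_dim, 2 * num_channels)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) content features; cond: (N, embed_dim) embedding.
        gamma, beta = self.affine(cond).chunk(2, dim=1)
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        # Instance-normalize the content features, then re-style them.
        mean = x.mean(dim=(2, 3), keepdim=True)
        std = x.std(dim=(2, 3), keepdim=True) + 1e-5
        return gamma * (x - mean) / std + beta

# Usage: modulate 256-channel features with a hypothetical 512-d identity embedding.
feats = torch.randn(1, 256, 32, 32)
id_embed = torch.randn(1, 512)
out = AdaIN(256, 512)(feats, id_embed)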

Notes

  1. http://lokeshdhakar.com/projects/color-thief/.
  2. https://github.com/postite/metfaces-dataset.

Acknowledgments

This work is supported by the National Key R&D Program of China (2019YFB1406202).

Author information

Corresponding author

Correspondence to GuiXuan Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gao, N., Zeng, Z., Zhang, G., Zhang, S. (2023). Heterogeneous Avatar Synthesis Based on Disentanglement of Topology and Rendering. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13844. Springer, Cham. https://doi.org/10.1007/978-3-031-26316-3_9

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-26316-3_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26315-6

  • Online ISBN: 978-3-031-26316-3

  • eBook Packages: Computer Science, Computer Science (R0)
