Abstract
Heterogeneous visual domains differ markedly in both facial structure and color. In this paper, we explore the challenging heterogeneous avatar synthesis (HAS) task, which considers both topology and rendering transfer. HAS transfers the topology and rendering style of a reference face to a source face to produce high-fidelity heterogeneous avatars. Specifically, we first use a Rendering Transfer Network (RT-Net) to render the grayscale source face according to the color palette of the reference face; the grayscale features and color style are injected into RT-Net through adaptive feature modulation. Second, we apply a Topology Transfer Network (TT-Net) to perform heterogeneous facial topology transfer, in which the image content produced by RT-Net is transformed via AdaIN controlled by a heterogeneous identity embedding. Comprehensive experiments show that disentangling rendering from topology benefits the HAS task and that our HASNet performs comparably to state-of-the-art methods.
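The AdaIN operation that TT-Net builds on is the standard formulation of Huang and Belongie (2017): content features are re-normalized channel-wise to take on the mean and standard deviation of the style (here, identity-embedding-derived) statistics. As an illustration only — this is a generic NumPy sketch of AdaIN itself, not the paper's identity-conditioned variant — it can be written as:

```python
import numpy as np

def adain(content: np.ndarray, style: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Adaptive Instance Normalization.

    Normalizes each channel of `content` to zero mean / unit std,
    then rescales and shifts it with the per-channel std and mean
    of `style`. Both inputs have shape (C, H, W).
    """
    c_mu = content.mean(axis=(1, 2), keepdims=True)   # per-channel content mean
    c_std = content.std(axis=(1, 2), keepdims=True)   # per-channel content std
    s_mu = style.mean(axis=(1, 2), keepdims=True)     # per-channel style mean
    s_std = style.std(axis=(1, 2), keepdims=True)     # per-channel style std
    return s_std * (content - c_mu) / (c_std + eps) + s_mu
```

After this transform the output carries the spatial content of the first argument but matches the channel statistics of the second, which is why controlling the style branch with an identity embedding can steer topology-level appearance.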
Acknowledgments
This work is supported by the National Key R&D Program of China (2019YFB1406202).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gao, N., Zeng, Z., Zhang, G., Zhang, S. (2023). Heterogeneous Avatar Synthesis Based on Disentanglement of Topology and Rendering. In: Wang, L., Gall, J., Chin, T.-J., Sato, I., Chellappa, R. (eds.) Computer Vision – ACCV 2022. Lecture Notes in Computer Science, vol. 13844. Springer, Cham. https://doi.org/10.1007/978-3-031-26316-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26315-6
Online ISBN: 978-3-031-26316-3