DOI:10.1145/3591106.3592291
short-paper

SOFA: Style-based One-shot 3D Facial Animation Driven by 2D landmarks

Published: 12 June 2023

Abstract

We propose a 2D landmark-driven 3D facial animation framework that is trained without the need for a 3D facial dataset. Our method decomposes the 3D facial avatar into geometry and texture. Given 2D landmarks as input, our models learn to estimate the parameters of FLAME and to transfer the target texture into different facial expressions. The experiments show that our method achieves remarkable results. Because it uses 2D landmarks as input data, our method has the potential to be deployed in scenarios where full RGB facial images are difficult to obtain (e.g., when the face is occluded by a VR head-mounted display).
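
As a rough illustration of the geometry branch described in the abstract, the following toy sketch maps 2D landmarks to morphable-model parameters and decodes them into 3D vertex coordinates. It is not the authors' implementation: a fixed linear map stands in for the learned estimator, a plain linear blendshape model stands in for FLAME (pose parameters and skinning are left out), the texture branch is omitted, and every name, size, and weight below is a hypothetical placeholder.

```python
"""Toy sketch: 2D landmarks -> morphable-model parameters -> 3D vertices.

All sizes and weights are hypothetical stand-ins, not values from the paper
or from the released FLAME model.
"""
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def normalize_landmarks(landmarks):
    """Center (x, y) points on their mean, scale to [-1, 1], and flatten."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    centered = [(x - cx, y - cy) for x, y in landmarks]
    # Fall back to 1.0 when all points coincide, to avoid dividing by zero.
    scale = max(max(abs(x), abs(y)) for x, y in centered) or 1.0
    return [c / scale for point in centered for c in point]


def estimate_parameters(features, weights, n_shape):
    """Linear stand-in for the learned estimator: features -> (shape, expression)."""
    params = matvec(weights, features)
    return params[:n_shape], params[n_shape:]


def decode_vertices(template, shape_basis, expr_basis, shape, expr):
    """Linear morphable model: template + shape offsets + expression offsets."""
    shape_offsets = matvec(shape_basis, shape)
    expr_offsets = matvec(expr_basis, expr)
    return [t + s + e for t, s, e in zip(template, shape_offsets, expr_offsets)]


def random_matrix(rows, cols, rng):
    """Small random matrix used in place of trained weights or learned bases."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def animate(landmarks, weights, template, shape_basis, expr_basis, n_shape):
    """Run the geometry branch end to end for one frame of landmarks."""
    features = normalize_landmarks(landmarks)
    shape, expr = estimate_parameters(features, weights, n_shape)
    return decode_vertices(template, shape_basis, expr_basis, shape, expr)


if __name__ == "__main__":
    rng = random.Random(0)
    n_landmarks, n_shape, n_expr, n_vertices = 5, 3, 2, 4  # toy sizes
    weights = random_matrix(n_shape + n_expr, 2 * n_landmarks, rng)
    template = [0.0] * (3 * n_vertices)
    shape_basis = random_matrix(3 * n_vertices, n_shape, rng)
    expr_basis = random_matrix(3 * n_vertices, n_expr, rng)
    frame = [(120.0, 80.0), (180.0, 82.0), (150.0, 120.0),
             (130.0, 160.0), (170.0, 161.0)]
    vertices = animate(frame, weights, template, shape_basis, expr_basis, n_shape)
    print(len(vertices))  # flat list with 3 coordinates per vertex
```

Normalizing the landmarks first makes the estimate independent of where the face sits in the image and how large it appears, which is one plausible reason to prefer landmarks over raw pixels when the face is partly occluded.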


    Published In

    ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval
    June 2023
    694 pages
    ISBN:9798400701788
    DOI:10.1145/3591106

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 3D avatar
    2. Facial animation
    3. Morphable model

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    ICMR '23

    Acceptance Rates

    Overall Acceptance Rate 254 of 830 submissions, 31%

    Bibliometrics & Citations

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 165
    • Downloads (Last 12 months): 40
    • Downloads (Last 6 weeks): 13
    Reflects downloads up to 03 Mar 2025
