Published by De Gruyter Oldenbourg, February 21, 2019

Face2Face: Real-time facial reenactment

  • Justus Thies



Abstract

This article summarizes the dissertation “Face2Face: Real-time Facial Reenactment” by Justus Thies (Eurographics Graphics Dissertation Online, 2017) [1]. It presents advances in the 3D reconstruction of human faces using commodity hardware. Besides the reconstruction of facial geometry and texture, real-time face tracking is demonstrated. The developed algorithms are based on the principle of analysis-by-synthesis. To apply this principle, a mathematical model that virtually represents a face is defined. This model is used to synthesize facial imagery, and its parameters are adjusted such that the synthesized image fits the input image as well as possible. In reverse, this process thus maps the input image to a virtual representation of the face. The achieved quality enables many new applications that require a faithful reconstruction of the face. One of these applications is the so-called “Facial Reenactment”. Our methods show that such an application does not need any special hardware. The generated results are nearly photo-realistic videos that show the transfer of the expressions of one person to another. These techniques can, for example, be used to bring movie dubbing to a new level: instead of adapting the audio to the video, which might also require changes to the text, the video can be post-processed to match the mouth movements of the dubber. Since the approaches shown in the dissertation run in real time, one can also imagine a live dubber in a video teleconferencing system who simultaneously translates a person’s speech into another language.
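To make the analysis-by-synthesis principle more concrete, the following is a simplified sketch; the notation is ours and not taken verbatim from the dissertation. The face is represented by a parametric model, e. g. a morphable model in the spirit of Blanz and Vetter [3], and fitting then amounts to minimizing a photometric energy between the synthesized image and the input frame:

\[
M(\alpha, \beta, \delta) = \bar{a} + E_{\mathrm{id}}\,\alpha + E_{\mathrm{alb}}\,\beta + E_{\mathrm{exp}}\,\delta,
\qquad
E(P) = \sum_{p \in \mathcal{V}} \bigl\lVert C_S(P, p) - C_I(p) \bigr\rVert_2^2 + \lambda\, E_{\mathrm{reg}}(P),
\]

where \(\alpha\), \(\beta\), and \(\delta\) denote identity, albedo, and expression coefficients, \(P\) collects all model parameters including pose and illumination, \(C_S(P, p)\) is the color of the synthesized image at a visible pixel \(p \in \mathcal{V}\), \(C_I(p)\) is the corresponding input color, and \(E_{\mathrm{reg}}\) is a statistical regularizer. Such an energy is typically minimized with a Gauss-Newton-type solver, which can be parallelized on graphics hardware to reach real-time rates. In this simplified view, reenactment corresponds to re-synthesizing the target video with the target’s identity and albedo coefficients but the source actor’s expression coefficients \(\delta\).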

The published videos of the projects in this dissertation led to a broad discussion in the media. This is due, on the one hand, to the fact that our methods are designed to run in real time and, on the other hand, to the fact that we reduced the hardware requirements to a minimum while improving the resulting quality. In fact, after some preprocessing, we are able to edit ordinary videos from the Internet in real time. Among other things, we impose different facial expressions on prominent persons such as former presidents of the United States of America. This inevitably led to a discussion about the trustworthiness of video material, especially from unknown sources. Most people did not expect that such manipulations were possible, overlooking existing methods that can already edit videos (e. g. special effects in movie productions). Thus, besides the advances in real-time face tracking, our projects raised awareness of video manipulation.


Article note

The dissertation has been published as a Research Highlight by the Communications of the ACM (January 2019), it was nominated by the University of Erlangen-Nuremberg for the GI Dissertation Award 2017, and the demonstration of the Face2Face technique won the “Best in Show Award” at SIGGRAPH 2016, Emerging Technologies. The work has been covered by the New York Times, the Washington Post, Der Spiegel, and other digital news outlets.


About the author

Justus Thies

Dr. Justus Thies is working as a postdoctoral researcher at the Technical University of Munich, where he joined the Visual Computing Lab of Prof. Dr. Matthias Nießner in September 2017. Previously, he was a PhD student at the University of Erlangen-Nuremberg under the supervision of Günther Greiner. He started his PhD studies in 2014 after receiving his Master of Science degree from the University of Erlangen-Nuremberg. During his time as a PhD student he collaborated with other institutes and did internships at Stanford University and the Max Planck Institute for Informatics. His research focuses on real-time facial performance capture and expression transfer using commodity hardware. He is therefore interested in computer vision and computer graphics, as well as in efficient implementations of optimization techniques, especially on graphics hardware. His publications opened up a new research field: real-time facial reenactment. The quality, efficiency, and reduced hardware requirements of his methods have attracted a lot of attention in academia, industry, and the media. His dissertation “Face2Face: Real-time Facial Reenactment” summarizes these publications and discusses the implications of the demonstrated technologies. Recently, he has also been working on digital forensics projects that aim at detecting forgeries in image and video data.

Literature

1. J. Thies. Face2Face: Real-time Facial Reenactment. Eurographics Graphics Dissertation Online, 2017. DOI: 10.1145/3292039.

2. R. A. Newcombe et al. KinectFusion: Real-time Dense Surface Mapping and Tracking. Proceedings of the IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2011. DOI: 10.1109/ISMAR.2011.6092378.

3. V. Blanz and T. Vetter. A Morphable Model for the Synthesis of 3D Faces. Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques, 1999. DOI: 10.1145/311535.311556.

Received: 2019-01-28
Accepted: 2019-02-02
Published Online: 2019-02-21
Published in Print: 2019-04-24

© 2019 Walter de Gruyter GmbH, Berlin/Boston
