ABSTRACT
Modern deep neural networks have enabled practical speech-driven facial animation, synthesizing natural and precise 3D animation from speech. Nevertheless, many existing methods struggle to produce pronounced emotional expression and offer little control over the resulting animation. In this work, we introduce emotion-guided speech-driven facial animation, which performs classification and regression on the speech input simultaneously to generate facial animation with a controllable degree of evident emotional expression. Extensive experiments indicate that our method generates more expressive facial animation, with controllable flexibility, than previous approaches.
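To make the multi-task idea concrete, below is a minimal PyTorch sketch of one plausible realization: a shared speech encoder feeds (a) an emotion-classification head and (b) a vertex-offset regression head conditioned on an emotion code, with a scalar intensity knob for controllability. All layer sizes, module names, and the `intensity` blending mechanism are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class EmotionGuidedAnimator(nn.Module):
    """Sketch: joint emotion classification + facial-motion regression from speech."""

    def __init__(self, n_mels=80, n_emotions=8, n_vertices=5023, emb_dim=64):
        super().__init__()
        self.n_vertices = n_vertices
        # Shared encoder over a window of mel-spectrogram frames (B, n_mels, T).
        self.encoder = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(128, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        # Classification head: emotion logits from the speech window.
        self.classifier = nn.Linear(256, n_emotions)
        # Learned embedding per emotion class, used to condition the regressor.
        self.emotion_emb = nn.Embedding(n_emotions, emb_dim)
        # Regression head: per-vertex 3D offsets from a neutral face mesh.
        self.regressor = nn.Sequential(
            nn.Linear(256 + emb_dim, 512), nn.ReLU(),
            nn.Linear(512, n_vertices * 3),
        )

    def forward(self, mel, intensity=1.0, emotion_override=None):
        h = self.encoder(mel)                        # (B, 256)
        logits = self.classifier(h)                  # (B, n_emotions)
        # Controllability: scale the emotion code, or override the class,
        # to dial expressiveness up or down at inference time.
        if emotion_override is None:
            probs = logits.softmax(dim=-1)
            e = probs @ self.emotion_emb.weight     # soft emotion code
        else:
            e = self.emotion_emb(emotion_override)  # forced emotion class
        offsets = self.regressor(torch.cat([h, intensity * e], dim=-1))
        return logits, offsets.view(-1, self.n_vertices, 3)
```

Under this assumed setup, training would combine a cross-entropy loss on the emotion logits with an L2 reconstruction loss on the vertex offsets (e.g. `ce(logits, labels) + w * mse(offsets, gt_offsets)`, with the weight `w` tuned on validation data); at inference, varying `intensity` or `emotion_override` exposes the controllable expressiveness the abstract describes.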