Abstract
Limited animation is one of the traditional techniques for producing cartoon animations. Owing to its expressive style, it has been enjoyed around the world. However, producing high-quality animations in this limited style is time-consuming and costly for animators. Furthermore, proper synchronization between the voice actor's voice and the character's mouth and lip motion requires experienced animators; this is essential because viewers are very sensitive to audio-lip discrepancies. In this paper, we propose a method that automatically creates high-quality limited-style lip-synched animations from audio tracks. Our system can be used to create not only original animations but also dubbed ones, independently of language. Because our approach follows the standard workflow employed in cartoon animation production, our system can successfully assist animators. In addition, users can implement our system as a plug-in for a standard animation tool (Adobe After Effects) and can easily adjust character lip motion to suit their own style. We visually evaluate our results both absolutely and relatively by comparing them with those of previous works. From the user evaluations, we confirm that our algorithm generates more natural audio-mouth synchronization in limited-style lip-synched animations than previous algorithms.
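To make the general idea concrete, here is a minimal sketch of audio-driven limited-animation lip sync. This is not the paper's algorithm; it only illustrates the generic approach of quantizing short-time audio loudness into a small set of mouth shapes (closed / half-open / open) and holding each shape for several video frames, as limited animation shoots "on twos". The function names, thresholds, and hold parameter are all illustrative assumptions.

```python
import math

def rms(frame):
    """Root-mean-square level of a list of audio samples in [-1, 1]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def mouth_track(samples, sr, fps=24, hold=2, thresholds=(0.1, 0.4)):
    """Map audio samples to one mouth-shape index per video frame.

    0 = closed, 1 = half-open, 2 = open.  `hold` keeps each chosen
    shape for that many frames, mimicking limited animation's
    convention of shooting on twos to reduce flicker.
    """
    spf = sr // fps  # audio samples per video frame
    shapes = []
    for i in range(0, len(samples) - spf + 1, spf):
        level = rms(samples[i:i + spf])
        # Count how many thresholds the loudness exceeds -> shape index.
        shapes.append(sum(level > t for t in thresholds))
    # Hold each shape for `hold` frames (limited-animation style).
    held = []
    for i, s in enumerate(shapes):
        held.append(held[-1] if (i % hold) and held else s)
    return held
```

For example, a silent frame followed by a loud frame yields shape indices `[0, 2]` with `hold=1`; a real pipeline would map these indices onto drawn mouth frames in the compositing tool.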
Acknowledgements
This work was supported in part by the Japanese Information-Technology Promotion Agency (IPA), JST ACCEL Grant No. JPMJAC 1602, and JSPS Grant No. 17H06101, Japan.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Furukawa, S., Fukusato, T., Yamaguchi, S., Morishima, S. (2018). Voice Animator: Automatic Lip-Synching in Limited Animation by Audio. In: Cheok, A., Inami, M., Romão, T. (eds) Advances in Computer Entertainment Technology. ACE 2017. Lecture Notes in Computer Science(), vol 10714. Springer, Cham. https://doi.org/10.1007/978-3-319-76270-8_12
Print ISBN: 978-3-319-76269-2
Online ISBN: 978-3-319-76270-8