DOI: 10.1145/3314111.3322874

Towards a data-driven framework for realistic self-organized virtual humans: coordinated head and eye movements

Published: 25 June 2019

Abstract

Driven by significant investments from the gaming, film, advertising, and customer service industries, among others, efforts across many different fields are converging to create realistic representations of humans that look like (computer graphics), sound like (natural language generation), move like (motion capture), and reason like (artificial intelligence) real humans. Despite advances such as production-level facial performance capture and transfer [Laine et al. 2017], the resulting renderings and animations are often still distinguishable from a real human, sometimes in unsettling ways: the so-called uncanny valley phenomenon [Mori et al. 2012]. We argue that the traditional approach of capturing and modeling the various human modalities separately contributes to this effect. Eye movements, for example, may be accompanied by changes in facial expression, head orientation, posture, gait, or speech, yet these modalities are typically recorded and modeled in isolation. The ultimate goal of this work is to push the boundaries further by developing realistic self-organized virtual humans capable of demonstrating coordinated behaviors across different modalities. We focus initially on capturing, transferring, and generating coordinated facial modalities, in particular eye and head movements (and eventually facial expressions), and we envision a flexible, data-driven framework that can be extended to accommodate other modalities as well.
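
As a purely illustrative sketch (not part of the published abstract), one way such a data-driven framework could jointly model coordinated eye and head movements is a recurrent sequence model in the spirit of Graves [2013]: both modalities are predicted from a single shared recurrent state rather than captured and modeled separately. The tensors gaze and head below are hypothetical stand-ins for synchronized eye-angle and head-orientation recordings; all names and dimensions are assumptions made for the example.

    import torch
    import torch.nn as nn

    class CoordinatedGazeHeadModel(nn.Module):
        """Jointly models eye angles (yaw, pitch) and head angles
        (yaw, pitch, roll) from one shared recurrent state, so the two
        modalities are generated together rather than in isolation."""

        def __init__(self, hidden_size=128):
            super().__init__()
            self.input_size = 2 + 3  # eye angles + head angles per frame
            self.rnn = nn.LSTM(self.input_size, hidden_size, batch_first=True)
            self.gaze_out = nn.Linear(hidden_size, 2)  # next-frame eye angles
            self.head_out = nn.Linear(hidden_size, 3)  # next-frame head angles

        def forward(self, gaze, head):
            # gaze: (batch, time, 2), head: (batch, time, 3), synchronized streams
            x = torch.cat([gaze, head], dim=-1)
            h, _ = self.rnn(x)
            return self.gaze_out(h), self.head_out(h)

    # Minimal training step on synthetic data (real recordings would replace this).
    model = CoordinatedGazeHeadModel()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    gaze = torch.randn(8, 100, 2)  # hypothetical: 8 clips, 100 frames, eye yaw/pitch
    head = torch.randn(8, 100, 3)  # hypothetical: matching head yaw/pitch/roll

    optimizer.zero_grad()
    pred_gaze, pred_head = model(gaze[:, :-1], head[:, :-1])
    loss = nn.functional.mse_loss(pred_gaze, gaze[:, 1:]) + \
           nn.functional.mse_loss(pred_head, head[:, 1:])
    loss.backward()
    optimizer.step()

The design choice that matters here is the shared hidden state: because the next head pose and the next gaze direction are decoded from the same latent vector, the model can only fit the data by learning how the two signals co-vary, which is the kind of cross-modal coordination the proposed framework targets.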

References

[1]
Michael Feffer, Rosalind W Picard, et al. 2018. A Mixture of Personalized Experts for Human Affect Estimation. In International Conference on Machine Learning and Data Mining in Pattern Recognition. Springer, 316--330.
[2]
Alexander Gepperth and Barbara Hammer. 2016. Incremental learning algorithms and applications. In European Symposium on Artificial Neural Networks (ESANN).
[3]
Alex Graves. 2013. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850 (2013).
[4]
Kolja Kahler, J"org Haber, and Hans-Peter Seidel. 2001. Geometry-based Muscle Modeling for Facial Animation. In Proceedings of the Graphics Interface 2001 Conference, June 7-9 2001, Ottawa, Ontario, Canada. 37--46. http://graphicsinterface.org/wp-content/uploads/gi2001-5.pdf
[5]
Max Kochurov, Timur Garipov, Dmitry Podoprikhin, Dmitry Molchanov, Arsenii Ashukha, and Dmitry Vetrov. 2018. Bayesian Incremental Learning for Deep Neural Networks. arXiv preprint arXiv:1802.07329 (2018).
[6]
Rakshit Kothari, Zhizhuo Yang, Kamran Binaee, Reynold Bailey, Christopher Kanan, Jeff Pelz, and Gabriel Diaz. 2018. Classification and Statistics of Gaze In World Events. Journal of Vision 18, 10 (2018), 376--376.
[7]
Samuli Laine, Tero Karras, Timo Aila, Antti Herva, Shunsuke Saito, Ronald Yu, Hao Li, and Jaakko Lehtinen. 2017. Production-level Facial Performance Capture Using Deep Convolutional Neural Networks. In Proceedings of the ACM SIGGRAPH / Eurographics Symposium on Computer Animation (SCA '17). ACM, New York, NY, USA, Article 10, 10 pages.
[8]
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 (2015), 529.
[9]
Masahiro Mori, Karl F. MacDorman, and Norri Kageki. 2012. The uncanny valley [from the field]. IEEE Robotics & Automation Magazine 19, 2 (2012), 98--100.
[10]
Andrew Y. Ng and Stuart Russell. 2000. Algorithms for Inverse Reinforcement Learning. 23--42 pages.
[11]
Meyke Roosink, Nicolas Robitaille, Bradford J. McFadyen, Luc J. Hébert, Philip L. Jackson, Laurent J. Bouyer, and Catherine Mercier. 2015. Real-time modulation of visual feedback on human full-body movements in a virtual mirror: development and proof-of-concept. Journal of NeuroEngineering and Rehabilitation 12, 1 (2015), 2.
[12]
Shunsuke Saito, Liwen Hu, Chongyang Ma, Hikaru Ibayashi, Linjie Luo, and Hao Li. 2018. 3D Hair Synthesis Using Volumetric Variational Autoencoders. In SIGGRAPH Asia 2018 Technical Papers (SIGGRAPH Asia '18). ACM, New York, NY, USA, Article 208, 12 pages.
[13]
David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016), 484.
[14]
David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, et al. 2018. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 6419 (2018), 1140--1144.
[15]
David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. 2017. Mastering the game of go without human knowledge. Nature 550, 7676 (2017), 354.
[16]
Shugo Yamaguchi, Shunsuke Saito, Koki Nagano, Yajie Zhao, Weikai Chen, Kyle Olszewski, Shigeo Morishima, and Hao Li. 2018. High-fidelity Facial Reflectance and Geometry Inference from an Unconstrained Image. ACM Trans. Graph. 37, 4, Article 162 (July 2018), 14 pages.
[17]
Raimondas Zemblys, Diederick C. Niehorster, and Kenneth Holmqvist. 2018. gazeNet: End-to-end eye-movement event detection with deep neural networks. Behavior Research Methods (2018).
[18]
Ruohan Zhang, Zhuode Liu, Luxin Zhang, Jake A. Whritner, Karl S. Muller, Mary M. Hayhoe, and Dana H. Ballard. 2018a. AGIL: Learning Attention from Human for Visuomotor Tasks. arXiv preprint arXiv:1806.03960 (2018), 1--17.
[19]
Ruohan Zhang, Shun Zhang, Matthew H Tong, Yuchen Cui, A Constantin, Dana H Ballard, and Mary M Hayhoe. 2018b. Modeling sensory-motor decisions in natural behavior. bioRxiv (2018), 1--27.


Published In

ETRA '19: Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications
June 2019, 623 pages
ISBN: 9781450367097
DOI: 10.1145/3314111

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Author Tags

          1. data driven animation
          2. eye-head coordination
          3. machine learning

          Qualifiers

          • Abstract

          Conference

          ETRA '19

          Acceptance Rates

          Overall Acceptance Rate 69 of 137 submissions, 50%
