ABSTRACT
In this paper, we address gaze estimation under practical and challenging conditions. Multi-view and multi-modal learning have been considered useful for various complex tasks; however, in-depth analyses and large-scale datasets for multi-view, multi-modal gaze estimation in a long-distance, low-illumination setup remain very limited. To address these limitations, we first construct a dataset of images captured under such challenging conditions. Second, we propose a simple deep learning architecture that can handle multi-view, multi-modal data for gaze estimation. Finally, we evaluate the proposed network on the constructed dataset to understand the effects of multiple views of a user and of multiple modalities (RGB, depth, and infrared). We report various findings from our preliminary experimental results and expect that they will help gaze estimation studies deal with challenging conditions.
Index Terms
- A Preliminary Study on Performance Evaluation of Multi-View Multi-Modal Gaze Estimation under Challenging Conditions