A crowdsourced system for robust eye tracking

doi:10.1016/j.jvcir.2019.01.007

Journal of Visual Communication and Image Representation

Volume 60, April 2019, Pages 28-32

https://doi.org/10.1016/j.jvcir.2019.01.007

Abstract

Eye tracking is widely used in modern intelligent applications such as human-computer interaction (HCI), somatosensory games, and driver-fatigue detection. Traditional eye tracking systems rely on Haar-like features or external hardware, so they are either inaccurate or complicated. Human gaze point is clearly related to head pose; however, head-pose labels in most datasets are ambiguous. In this paper, we propose a crowdsourced system that can collect a large-scale dataset for eye tracking. For better performance, we use a head guidance point and a randomly placed dot, rather than a fixed dot, as the target of attention when capturing frames from the camera. Different illumination conditions, poses, and persons are also considered for robustness. We further propose a two-phase CNN training strategy that combines head pose and eye angles. The proposed CNN architecture reduces the overfitting that occurs when eye tracking models are trained with head pose directly. Experimental results show that the proposed method performs well in eye tracking.

Introduction

With the development of computer vision, eye tracking has become an indispensable technique [3], [33]. It can be used in HCI, somatosensory games, and driver-fatigue detection. HCI is the interface for communication between humans and computers, so eye tracking can readily be used to improve the user experience [31], [32], [33]. Because eye tracking estimates the direction of gaze, it is widely used in fatigue-driving detection, where it can raise an alert when changes in a driver's gaze direction indicate fatigue. In addition, biology and psychology experiments show that humans focus on only a few objects when they observe a scene or an image [5], [7], [9]. Gaze shifting paths (GSPs) reflect the sequence of salient regions within an image [2]. As shown in Fig. 1, humans tend to focus on the most salient regions within an image. Eyetracker II is a hardware device that captures human gaze shifting paths while a person views a computer screen. It is highly accurate, but the cumbersome hardware limits its use.

Traditional eye tracking methods are usually based on Haar-like features, which are also widely used in face recognition. A Haar-like feature computes the difference between adjacent image regions and uses that difference as the feature. Because it is a low-level feature, its accuracy is limited; for example, it is easily affected by illumination changes such as dim lighting. A better approach is to use external hardware such as infrared equipment, which can exploit fused eye features to achieve better performance. In this way, eye tracking can run in real time, which makes some applications based on it possible. However, relying on hardware devices is clearly complicated and expensive.
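To make the region-difference idea concrete, the following is a minimal sketch of a two-rectangle Haar-like feature computed with an integral image (summed-area table). It is an illustration only, not the detector used in any system discussed here; the function names and the tiny example image are assumptions introduced for this sketch.

```python
from typing import List

Image = List[List[float]]


def integral_image(img: Image) -> Image:
    """Return a summed-area table padded with one leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Image, x: int, y: int, w: int, h: int) -> float:
    """Sum of the pixels in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect(ii: Image, x: int, y: int, w: int, h: int) -> float:
    """Left-half sum minus right-half sum of a window (w is assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


if __name__ == "__main__":
    bright = [[1.0, 1.0, 5.0, 5.0], [1.0, 1.0, 5.0, 5.0]]
    # Hypothetical dim-light version of the same scene: every pixel halved.
    dark = [[p * 0.5 for p in row] for row in bright]
    print(haar_two_rect(integral_image(bright), 0, 0, 4, 2))
    print(haar_two_rect(integral_image(dark), 0, 0, 4, 2))
```

Because the feature is a raw intensity difference, scaling the image brightness scales the feature by the same factor, which is one simple way to see why such low-level features are sensitive to illumination.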

As far as we know, there is no related dataset in this domain. In this paper, we therefore propose a dataset collection framework based on a crowdsourced system, together with a two-phase training strategy for better performance. In the first phase, we train the head pose and gaze angle models separately, which does not require very precise labels. This reduces the difficulty of collecting data and increases robustness. In the second phase, we combine the models from the first phase and continue to fine-tune them. In this phase, the label is the gaze point obtained by our data collection system. Our two-phase training strategy reduces overfitting compared with training on head pose directly.
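The two-phase idea can be illustrated with a deliberately small toy, which is not the CNN described in the paper. In this sketch, each "model" is assumed to be a one-parameter linear map, phase 1 fits the head-pose and eye-angle models separately on coarse labels, and phase 2 fine-tunes only a pair of fusion weights on gaze-point labels. All names, the learning rate, and the restriction of phase 2 to the fusion weights are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]


def fit_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Phase 1: closed-form least-squares slope for y ~ a * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def fusion_mse(w: Pair, feats: Sequence[Pair], targets: Sequence[float]) -> float:
    """Mean squared error of the fused prediction w[0] * head + w[1] * eye."""
    return sum((w[0] * h + w[1] * e - t) ** 2
               for (h, e), t in zip(feats, targets)) / len(targets)


def finetune_fusion(w: Pair, feats: Sequence[Pair], targets: Sequence[float],
                    lr: float = 0.1, steps: int = 200) -> Pair:
    """Phase 2: plain gradient descent on the fusion weights with gaze-point labels."""
    w0, w1 = w
    n = len(targets)
    for _ in range(steps):
        g0 = g1 = 0.0
        for (h, e), t in zip(feats, targets):
            err = w0 * h + w1 * e - t
            g0 += 2.0 * err * h / n
            g1 += 2.0 * err * e / n
        w0, w1 = w0 - lr * g0, w1 - lr * g1
    return w0, w1


def two_phase_fit(head_x: Sequence[float], head_y: Sequence[float],
                  eye_x: Sequence[float], eye_y: Sequence[float],
                  gaze: Sequence[float]):
    """Run both phases; return (head slope, eye slope, initial w, tuned w, features)."""
    a = fit_slope(head_x, head_y)  # head-pose model trained on coarse labels
    b = fit_slope(eye_x, eye_y)    # eye-angle model trained on coarse labels
    feats: List[Pair] = [(a * hx, b * ex) for hx, ex in zip(head_x, eye_x)]
    w_init = (1.0, 1.0)            # start phase 2 from an equal-weight combination
    w_tuned = finetune_fusion(w_init, feats, gaze)
    return a, b, w_init, w_tuned, feats
```

The point of the sketch is the ordering: the two sub-models are fitted first without needing gaze-point labels, and only the combined predictor is trained against the gaze point. A real system would replace the linear maps with networks and would likely fine-tune more than the fusion weights.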

The main contributions of our work can be summarized as follows:

• Most related works show that training an eye tracking model directly with head pose leads to overfitting. Head-pose labels are also not easy to obtain, so researchers usually need precise equipment to get accurate results, which is complicated and expensive. We propose a two-phase training strategy that does not require very precise labels during the first phase of training.
• We propose a succinct crowdsourced system for collecting an eye tracking dataset. Our implementation covers a wide range of environments, so the dataset is robust.
• As far as we know, there are few related datasets for eye tracking. We believe that our work can promote the development of eye tracking.

Section snippets

Related work

Eye tracking can be used in many intelligent systems and domains. It can replace the mouse for operating a computer [11], [13], [22], [23], [24], [25]. In many computer games, eye tracking provides a better game experience. Eye gaze systems can also be used in advertising analysis and fatigue-driving detection. For example, when users browse a webpage, we can record the user's gaze shifting path and analyze the user's attention and time of

The data set

In this section, we present our framework for eye dataset collection. Fig. 3 shows our dataset collection platform. In our implementation, a randomly generated dot is used instead of a fixed dot. In this way, a large and varied eye dataset can be obtained. First, a randomly generated dot is shown on the screen, and volunteers are required to point their head at the position where it appears. Then, another point is randomly generated around this point.
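The two-step sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions: the screen is addressed in pixel coordinates, "around this point" is taken to mean a per-axis offset of at most `radius` pixels, and the second point is clamped to the screen. The function name `sample_targets` and the `radius` parameter are hypothetical and do not come from the paper.

```python
import random
from typing import Tuple

Point = Tuple[int, int]


def sample_targets(width: int, height: int, radius: int,
                   rng: random.Random) -> Tuple[Point, Point]:
    """Draw a head-guidance point, then a second dot near it on the same screen."""
    # Step 1: the head-guidance point is uniform over the whole screen.
    head = (rng.randrange(width), rng.randrange(height))
    # Step 2: the second dot is offset by at most `radius` pixels on each axis.
    gaze_x = head[0] + rng.randint(-radius, radius)
    gaze_y = head[1] + rng.randint(-radius, radius)
    # Clamp so the second dot never leaves the visible area.
    gaze = (min(max(gaze_x, 0), width - 1), min(max(gaze_y, 0), height - 1))
    return head, gaze


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a collection session can be replayed
    for _ in range(3):
        print(sample_targets(1920, 1080, 200, rng))
```

Passing an explicit `random.Random` instance keeps sessions reproducible, which can help when the recorded frames need to be matched back to the dots that were displayed.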

Dataset collection

In our implementation, we invite volunteers to collect the eye dataset. To obtain robust performance, we vary the illumination, the persons, and whether glasses are worn. Dataset collection is conducted on a PC with a camera.

First, volunteers are required to keep their head fixed in front of the PC, and the distance $L_{ec}$ between the volunteer's eyes and the camera is recorded. The angle between the eyes and the generated dot is an important piece of information. As shown in Fig. 4, $\alpha$ denotes the angle between the generated dot and

Conclusion

Eye tracking is an indispensable technique in intelligent systems [17], [18], [19], [27], [28], [29], [30]. Traditional eye tracking is based on low-level features, so its accuracy is limited. In this paper, we propose a dataset collection strategy. We use a randomly generated dot instead of a fixed dot to obtain more robust results. Based on the dataset, we propose a two-phase training strategy for eye tracking. We argue that the two-phase training strategy performs better

Conflict of interest

There is no conflict of interest.

References (34)

Tadas Baltrusaitis, Peter Robinson, Louis-Philippe Morency, OpenFace: an open source facial behavior analysis toolkit,...
Luming Zhang et al., Spatial-aware object-level saliency prediction by learning graphlet hierarchies, IEEE Trans. Ind. Electron. (2015)
L. Zhang et al., Actively learning human gaze shifting paths for semantics-aware photo cropping, IEEE Trans. Image Process. (2014)
L. Sun et al., Real-time gaze estimation with online calibration
L. Zhang et al., Weakly supervised photo cropping, IEEE Trans. Multimedia (2014)
Mingliang Xu et al., An efficient method of crowd aggregation computation in public areas, IEEE Trans. Circ. Syst. Video Technol. (2017)
L. Zhang et al., Probabilistic graphlet transfer for photo cropping, IEEE Trans. Image Process. (2013)
G. Cheng et al., Duplex metric learning for image set classification, IEEE Trans. Image Process. (2018)
L. Zhang et al., Fusion of multichannel local and global structural cues for photo aesthetics evaluation, IEEE Trans. Image Process. (2014)
E. Wood, T. Baltrusaitis, X. Zhang, Y. Sugano, P. Robinson, A. Bulling, Rendering of eyes for eye-shape registration...
L. Zhang et al., An effective video summarization framework toward handheld devices, IEEE Trans. Ind. Electron. (2015)
K. Krafka et al., Eye tracking for everyone
L. Zhang et al., Discovering discriminative graphlets for aerial image categories recognition, IEEE Trans. Image Process. (2013)
X. Yao et al., Revisiting co-saliency detection: a novel approach based on two-stage multi-view spectral rotation co-clustering, IEEE Trans. Image Process. (2017)
Luming Zhang et al., Weakly supervised human fixations prediction, IEEE Trans. Cybernet. (2016)
J. Han et al., Robust object co-segmentation using background prior, IEEE Trans. Image Process. (2018)
Luming Zhang et al., A fine-grained image categorization system by cellet-encoded spatial pyramid modeling, IEEE Trans. Ind. Electron. (2015)

^☆: This article is part of the Special Issue on TIUSM.
