A crowdsourced system for robust eye tracking☆
Introduction
Eye tracking has become an indispensable technique in computer vision [3], [33]. It can be applied to human-computer interaction (HCI), somatosensory applications, and fatigue-driving detection. Since HCI is the interface through which humans and computers communicate, eye tracking can readily improve the user experience [31], [32], [33]. Because eye tracking estimates the direction of gaze, it is widely used in fatigue-driving detection, where an alert can be raised when changes in a driver's gaze direction indicate fatigue. In addition, experiments in biology and psychology show that humans focus on only a few objects when they observe a scene or an image [5], [7], [9]. Gaze shifting paths (GSPs) reflect the sequence of salient regions within an image [2]. As shown in Fig. 1, humans tend to focus on the most salient regions of an image. Eyetracker II is a hardware device that captures human gaze shifting paths while a person views content in front of a computer. It is highly accurate, but its cumbersome hardware limits its use.
Traditional eye tracking methods are typically based on Haar-like features, which are widely used in face recognition. A Haar-like feature measures the difference between image regions and uses that difference as the feature. Because it is a low-level feature, its accuracy is limited; for example, it is easily affected by illumination changes such as dim light. A better approach is to use external hardware such as infrared equipment, which can exploit fused features of the eyes to achieve better performance. In this way, eye tracking can run in real time, which makes applications built on it feasible. However, relying on hardware devices is obviously complicated and expensive.
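To make the region-difference idea concrete, the following is a minimal Python sketch of a two-rectangle Haar-like feature computed with an integral image (summed-area table). It is an illustration only, not the implementation of any cited system: the function names, the toy 4x4 patch, and the simulated dimming are assumptions introduced here.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def rect_sum(sat, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def two_rect_feature(sat, top, left, height, width):
    """Horizontal two-rectangle feature: left-half sum minus right-half sum."""
    half = width // 2
    left_sum = rect_sum(sat, top, left, height, half)
    right_sum = rect_sum(sat, top, left + half, height, half)
    return left_sum - right_sum


# Hypothetical toy patch: bright left half, dark right half.
patch = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
sat = integral_image(patch)
feature = two_rect_feature(sat, 0, 0, 4, 4)

# Simulate darker lighting by scaling every pixel down (integer division by 3).
dimmed = [[v // 3 for v in row] for row in patch]
dimmed_feature = two_rect_feature(integral_image(dimmed), 0, 0, 4, 4)

print(feature, dimmed_feature)
```

The same patch yields a much smaller feature value once it is dimmed, which is one way to see why a raw region difference is sensitive to illumination.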
To the best of our knowledge, there is no related dataset in this domain. In this paper, we therefore propose a dataset collection framework based on a crowdsourced system, together with a two-phase training strategy for better performance. In the first phase, we train head pose and gaze angle separately, which does not require very precise labels. This reduces the difficulty of collecting data and increases robustness. In the second phase, we combine the first-phase models and continue to finetune them. In this phase, the label is the gaze point obtained by our data collection system. Compared with training on head pose directly, our two-phase training strategy reduces overfitting.
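The two phases described above can be sketched with a toy example. This is only a schematic under strong assumptions, not the model used in the paper: each sub-model is a scalar linear regressor, the data are synthetic, the imprecise first-phase labels are imitated by rounding to the nearest 0.5, and the combined model simply adds the two sub-model outputs. All names are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the toy run is repeatable


def fit_linear(xs, ys, lr=0.1, epochs=300):
    """Phase-1 helper: fit y ~ w*x + b by gradient descent on squared error."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        w -= lr * 2 * sum(e * x for e, x in zip(errs, xs)) / n
        b -= lr * 2 * sum(errs) / n
    return [w, b]


def predict_point(head, gaze, xh, xg):
    """Combined model (assumed additive): head-pose output plus gaze-angle output."""
    return head[0] * xh + head[1] + gaze[0] * xg + gaze[1]


def mse(head, gaze, data):
    """Mean squared error of the combined model on (xh, xg, point) triples."""
    return sum((predict_point(head, gaze, xh, xg) - p) ** 2 for xh, xg, p in data) / len(data)


def finetune(head, gaze, data, lr=0.1, epochs=300):
    """Phase 2: update both sub-models jointly against gaze-point labels."""
    head, gaze, n = list(head), list(gaze), len(data)
    for _ in range(epochs):
        errs = [predict_point(head, gaze, xh, xg) - p for xh, xg, p in data]
        g_wh = 2 * sum(e * xh for e, (xh, _, _) in zip(errs, data)) / n
        g_wg = 2 * sum(e * xg for e, (_, xg, _) in zip(errs, data)) / n
        g_b = 2 * sum(errs) / n
        head[0] -= lr * g_wh
        gaze[0] -= lr * g_wg
        head[1] -= lr * g_b
        gaze[1] -= lr * g_b
    return head, gaze


def coarse(v):
    """Imitate an imprecise label by rounding to the nearest 0.5."""
    return round(v * 2) / 2


# Synthetic stand-in data: one scalar feature per sub-model.
head_feats = [random.uniform(-1, 1) for _ in range(200)]
eye_feats = [random.uniform(-1, 1) for _ in range(200)]
head_labels = [coarse(2.0 * x) for x in head_feats]  # coarse head-pose labels
gaze_labels = [coarse(3.0 * x) for x in eye_feats]   # coarse gaze-angle labels
points = [(a, b, 2.0 * a + 3.0 * b + 0.5) for a, b in zip(head_feats, eye_feats)]

# Phase 1: the two sub-models are trained separately on coarse labels.
head_model = fit_linear(head_feats, head_labels)
gaze_model = fit_linear(eye_feats, gaze_labels)
loss_before = mse(head_model, gaze_model, points)

# Phase 2: the combined model is finetuned on gaze-point labels.
head_ft, gaze_ft = finetune(head_model, gaze_model, points)
loss_after = mse(head_ft, gaze_ft, points)
print(f"loss before finetune: {loss_before:.4f}, after: {loss_after:.4f}")
```

In this toy setting the first phase gets the sub-models close using only coarse labels, and the second phase corrects the remaining error against the gaze-point labels. It says nothing about overfitting, which depends on the real models and data.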
The main contributions of our work can be summarized as follows:
- Most related works show that training an eye tracking model directly on head pose leads to overfitting. Head pose labels are not easy to obtain, so researchers usually need precise equipment to get accurate results, which is complicated and expensive. We propose a two-phase training strategy that does not require very precise labels during the first phase of training.
- We propose a succinct crowdsourced system for collecting an eye tracking dataset. In our implementation, we cover most environmental conditions, so our dataset is robust.
- As far as we know, there are few related datasets for eye tracking. We believe that our work can promote the development of eye tracking.
Related work
Eye tracking can be used in many domains of intelligent systems. It can replace the mouse for operating a computer [11], [13], [22], [23], [24], [25], and in many computer games it provides a better gaming experience. Eye gaze systems can also be used in advertising analysis and fatigue-driving detection. For example, when users browse a webpage, we can record the user's gaze shifting path and analyze the user's attention and time of
The data set
In this section, we present our framework for eye dataset collection. Fig. 3 shows our dataset collection platform. In our implementation, randomly generated dots are used instead of fixed dots. In this way, a large-scale and varied eye dataset can be obtained, so our data are diverse. First, a randomly generated dot is shown on the screen, and volunteers are required to point their head at the position where it appears. Then, another dot is randomly generated around this point.
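The dot-generation step can be sketched as follows. This is a minimal illustration, not the platform's actual code: the screen resolution, the neighbourhood radius, the uniform sampling, and the function names are all assumptions.

```python
import math
import random


def random_dot(width, height, rng):
    """First target: a dot placed uniformly at random on the screen."""
    return rng.randrange(width), rng.randrange(height)


def nearby_dot(center, radius, width, height, rng):
    """Second target: a random dot roughly within `radius` pixels of `center`."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dist = radius * math.sqrt(rng.random())  # sqrt gives uniform density over the disc
    x = int(round(center[0] + dist * math.cos(angle)))
    y = int(round(center[1] + dist * math.sin(angle)))
    # Clamp so the dot never leaves the visible area.
    return min(max(x, 0), width - 1), min(max(y, 0), height - 1)


rng = random.Random(42)                  # seeded for repeatability
WIDTH, HEIGHT, RADIUS = 1920, 1080, 150  # assumed screen size and radius, in pixels
first = random_dot(WIDTH, HEIGHT, rng)
second = nearby_dot(first, RADIUS, WIDTH, HEIGHT, rng)
print(first, second)
```

Sampling a fresh position on every trial, rather than cycling through a fixed grid, is what spreads the recorded gaze targets across the whole screen.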
Dataset collection
In our implementation, we invite volunteers to collect the eye dataset. To obtain robust performance, we vary the illumination, the participants, and whether they wear glasses. Data collection is conducted on a PC with a camera.
First, volunteers are required to fix their head in front of the PC, and the distance between the volunteer's eyes and the camera is recorded. The angle between the eyes and the generated dot is an important piece of information. As shown in Fig. 4, denotes the angle between the generated dot and
Conclusion
Eye tracking is an indispensable technique in intelligent systems [17], [18], [19], [27], [28], [29], [30]. Traditional eye tracking is based on low-level features, so its accuracy is limited. In this paper, we propose a dataset collection strategy. We use randomly generated dots instead of fixed dots to obtain more robust results. Based on this dataset, we propose a two-phase training strategy for eye tracking. We argue that the two-phase training strategy performs better
Conflict of interest
There is no conflict of interest.
References (34)
- Tadas Baltrusaitis, Peter Robinson, Louis-Philippe Morency, OpenFace: an open source facial behavior analysis toolkit,...
- et al., Spatial-aware object-level saliency prediction by learning graphlet hierarchies, IEEE Trans. Ind. Electron. (2015)
- et al., Actively learning human gaze shifting paths for semantics-aware photo cropping, IEEE Trans. Image Process. (2014)
- et al., Real-time gaze estimation with online calibration
- et al., Weakly supervised photo cropping, IEEE Trans. Multimedia (2014)
- et al., An efficient method of crowd aggregation computation in public areas, IEEE Trans. Circ. Syst. Video Technol. (2017)
- et al., Probabilistic graphlet transfer for photo cropping, IEEE Trans. Image Process. (2013)
- et al., Duplex metric learning for image set classification, IEEE Trans. Image Process. (2018)
- et al., Fusion of multichannel local and global structural cues for photo aesthetics evaluation, IEEE Trans. Image Process. (2014)
- E. Wood, T. Baltrusaitis, X. Zhang, Y. Sugano, P. Robinson, A. Bulling, Rendering of eyes for eye-shape registration...
- An effective video summarization framework toward handheld devices, IEEE Trans. Ind. Electron.
- Eye tracking for everyone
- Discovering discriminative graphlets for aerial image categories recognition, IEEE Trans. Image Process.
- Revisiting co-saliency detection: a novel approach based on two-stage multi-view spectral rotation co-clustering, IEEE Trans. Image Process.
- Weakly supervised human fixations prediction, IEEE Trans. Cybernet.
- Robust object co-segmentation using background prior, IEEE Trans. Image Process.
- A fine-grained image categorization system by cellet-encoded spatial pyramid modeling, IEEE Trans. Ind. Electron.
☆ This article is part of the Special Issue on TIUSM.