ABSTRACT
Existing eye trackers typically require an explicit personal calibration procedure to estimate subject-dependent eye parameters. Despite efforts to simplify it, the calibration process remains unnatural and bothersome, particularly for users of personal and mobile devices. To alleviate this problem, we introduce a technique that eliminates explicit personal calibration. By combining a new calibration procedure with eye fixation prediction, the proposed method performs implicit personal calibration without the active participation, or even the knowledge, of the user. Specifically, unlike the traditional deterministic calibration procedure, which minimizes the differences between the predicted and the actual eye gazes, we introduce a stochastic calibration procedure that minimizes the difference between the probability distribution of the predicted eye gaze and that of the actual eye gaze. Furthermore, instead of using a saliency map to approximate the eye fixation distribution, we propose a regression-based deep convolutional neural network (RCNN) that learns image features specifically to predict eye fixation. By combining the distribution-based calibration with the deep fixation prediction procedure, personal eye parameters can be estimated without explicit user collaboration. We apply the proposed method to both 2D regression-based and 3D model-based eye gaze tracking. Experimental results show that the proposed method outperforms other implicit calibration methods and achieves results comparable to those of traditional explicit calibration methods.
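To make the distribution-matching idea concrete, the following is a minimal sketch, not the paper's actual algorithm: the paper calibrates subject-dependent eye parameters of a gaze model, whereas this toy example calibrates only a constant 2D gaze bias by grid search, choosing the offset that minimizes the KL divergence between the histogram of corrected gaze points and a predicted fixation distribution. All function names and the grid/search settings are illustrative assumptions.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions on the same grid."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def hist2d(points, bins=16):
    """2D histogram of normalized [0,1)^2 gaze points; tiny floor avoids empty bins."""
    h, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                             bins=bins, range=[[0, 1], [0, 1]])
    return h + 1e-6

def calibrate_offset(raw_gaze, fixation_map, search=np.linspace(-0.2, 0.2, 41)):
    """Grid-search a constant bias (dx, dy) so that the distribution of
    corrected gaze points best matches the fixation distribution (min KL)."""
    best, best_kl = (0.0, 0.0), np.inf
    for dx in search:
        for dy in search:
            shifted = np.clip(raw_gaze + np.array([dx, dy]), 0.0, 1.0 - 1e-9)
            d = kl(hist2d(shifted), fixation_map)
            if d < best_kl:
                best_kl, best = d, (dx, dy)
    return best

# Toy check: simulate uncalibrated gaze biased by (+0.10, -0.05)
# relative to the true fixations, then recover the correction.
rng = np.random.default_rng(0)
true_fix = rng.uniform(0.25, 0.75, size=(2000, 2))   # "predicted" fixations
fmap = hist2d(true_fix)                               # fixation distribution
raw = np.clip(true_fix + np.array([0.10, -0.05]), 0.0, 1.0 - 1e-9)
dx, dy = calibrate_offset(raw, fmap)                  # expect roughly (-0.10, +0.05)
```

In the paper's setting the fixation distribution would come from the RCNN's predicted fixation map rather than a histogram of known fixations, and the optimized quantities would be the eye parameters of the 2D regression-based or 3D model-based gaze mapping rather than a bias term.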
Index Terms
- Deep eye fixation map learning for calibration-free eye gaze tracking