Abstract
Eye-tracking and head-tracking techniques have been applied in many fields, including human–computer interaction, gaming, virtual reality (VR), and medicine. These applications typically require users to wear special hardware such as eye trackers or head-mounted devices. However, such devices are expensive, can be complicated to operate, and may be uncomfortable to wear. How, then, can eye and head movements be tracked in real time without them? In this paper, we present a real-time camera-based gaze-tracking system that provides two interactive modes: eye gaze and head gaze. The system uses the same calibration procedure in both modes to project the gaze direction of the eyes or head onto screen coordinates. We designed a 9-point circular interface to evaluate accuracy. Eye gaze and head gaze achieved visual angle errors of 1.76 and 2.65 degrees, respectively, which are comparable to those of commercial eye trackers. We also applied the system to a game and verified its effectiveness for interaction by analyzing user experience and game scores under the different interactive modes. The experimental results showed that users achieved scores with eye gaze similar to those obtained with the keyboard and felt more immersed with head gaze. Our findings can help provide users with more entertaining ways to interact with computers.
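To make the calibration and accuracy measure concrete, the following is a minimal sketch and not the authors' implementation. It assumes that gaze direction is available as (yaw, pitch) angles in degrees, that an affine least-squares fit is an acceptable stand-in for the calibration mapping (the abstract does not state which mapping is used), and that visual angle error can be approximated from the on-screen offset and the viewing distance. The function names and the parameters `px_per_cm` and `distance_cm` are hypothetical.

```python
import numpy as np


def fit_calibration(gaze_angles, screen_points):
    """Fit an affine map from (yaw, pitch) angles to screen (x, y) pixels.

    Assumption: an affine least-squares model; the paper's mapping may differ.
    """
    gaze_angles = np.asarray(gaze_angles, dtype=float)
    screen_points = np.asarray(screen_points, dtype=float)
    # Append a constant column so the fitted map includes an offset term.
    design = np.hstack([gaze_angles, np.ones((len(gaze_angles), 1))])
    coeffs, *_ = np.linalg.lstsq(design, screen_points, rcond=None)
    return coeffs  # shape (3, 2): two slope rows and one offset row


def project_to_screen(coeffs, gaze_angles):
    """Apply the fitted map to one or more (yaw, pitch) samples."""
    gaze_angles = np.atleast_2d(np.asarray(gaze_angles, dtype=float))
    design = np.hstack([gaze_angles, np.ones((len(gaze_angles), 1))])
    return design @ coeffs


def visual_angle_error_deg(predicted_px, target_px, px_per_cm, distance_cm):
    """Approximate the angle subtended at the eye by the prediction error.

    Assumption: the target lies near the point on the screen directly in
    front of the viewer, so the error is atan(offset / viewing distance).
    """
    diff = np.asarray(predicted_px, dtype=float) - np.asarray(target_px, dtype=float)
    offset_cm = np.linalg.norm(diff, axis=-1) / px_per_cm
    return np.degrees(np.arctan2(offset_cm, distance_cm))
```

In use, `fit_calibration` would be called once with the gaze angles recorded while the user looks at the calibration targets, and `project_to_screen` would then be applied to each incoming frame. Because the same functions accept either eye or head angles, one procedure can serve both interactive modes, which mirrors the shared calibration described above.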
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This work was supported in part by the Key R&D Program of Hunan (2022SK2104), in part by the Leading Plan for Scientific and Technological Innovation of High-Tech Industries of Hunan (2022GK4010), in part by the National Key R&D Program of China (2021YFF0900600), and in part by the National Natural Science Foundation of China (61672222).
Author information
Contributions
HZ and LY wrote the main manuscript text. HZ proposed the conception of this work and revised the paper. LY prepared figures in the paper. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by T. Yao.
About this article
Cite this article
Zhang, H., Yin, L. & Zhang, H. A real-time camera-based gaze-tracking system involving dual interactive modes and its application in gaming. Multimedia Systems 30, 15 (2024). https://doi.org/10.1007/s00530-023-01204-9
DOI: https://doi.org/10.1007/s00530-023-01204-9