
Automatic Recognition and Augmentation of Attended Objects in Real-time using Eye Tracking and a Head-mounted Display

Published: 25 May 2021

Abstract

Scanning and processing visual stimuli in a scene is essential for the human brain to make situation-aware decisions. Adding the ability to observe this scanning behavior and scene processing to intelligent mobile user interfaces can enable a new class of cognition-aware user interfaces. As a first step in this direction, we implement an augmented reality (AR) system that classifies objects at the user’s point of regard, detects visual attention to them, and augments the real objects with virtual labels that stick to the objects in real time. We implement our prototype on a head-mounted AR device (Microsoft HoloLens 2) with integrated eye tracking capabilities and a front-facing camera.
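The pipeline the abstract describes — mapping the gaze ray to a scene object, detecting sustained visual attention to it, and only then attaching a virtual label — can be illustrated with a minimal dwell-time attention detector. This is a hypothetical sketch, not the authors' implementation: the `GazeSample` structure, the `DwellAttentionDetector` class, and the 0.5 s dwell threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GazeSample:
    """One eye-tracker sample: a timestamp in seconds and the id of the
    scene object currently hit by the gaze ray (None if no object is hit)."""
    timestamp: float
    object_id: Optional[str]


class DwellAttentionDetector:
    """Reports an object as attended once gaze has rested on it for at
    least `dwell_threshold` seconds without switching to another object."""

    def __init__(self, dwell_threshold: float = 0.5):
        self.dwell_threshold = dwell_threshold
        self._current: Optional[str] = None  # object gaze is resting on
        self._since: float = 0.0             # when that rest began

    def update(self, sample: GazeSample) -> Optional[str]:
        # Gaze moved to a different object (or to none): restart the timer.
        if sample.object_id != self._current:
            self._current = sample.object_id
            self._since = sample.timestamp
        # Dwelled long enough on a real object -> attention detected.
        if (self._current is not None
                and sample.timestamp - self._since >= self.dwell_threshold):
            return self._current
        return None


detector = DwellAttentionDetector(dwell_threshold=0.5)
print(detector.update(GazeSample(0.0, "mug")))  # dwell just started
print(detector.update(GazeSample(0.3, "mug")))  # below threshold
print(detector.update(GazeSample(0.6, "mug")))  # attention detected
```

In the full system, the object id would come from classifying a camera crop around the point of regard, and a positive result from the detector would trigger anchoring the classification label to that object in the AR view.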




Published In

ETRA '21 Adjunct: ACM Symposium on Eye Tracking Research and Applications
May 2021, 78 pages
ISBN: 9781450383578
DOI: 10.1145/3450341

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. augmented reality
  2. cognition-aware computing
  3. computer vision
  4. eye tracking
  5. visual attention

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

ETRA '21

Acceptance Rates

Overall Acceptance Rate 69 of 137 submissions, 50%


Cited By

  • (2024) A review of machine learning in scanpath analysis for passive gaze-based interaction. Frontiers in Artificial Intelligence 7. https://doi.org/10.3389/frai.2024.1391745. Online publication date: 5-Jun-2024.
  • (2024) HumanEYEze 2024: Workshop on Eye Tracking for Multimodal Human-Centric Computing. In Proceedings of the 26th International Conference on Multimodal Interaction, 696–697. https://doi.org/10.1145/3678957.3688384. Online publication date: 4-Nov-2024.
  • (2024) NeighboAR: Efficient Object Retrieval using Proximity- and Gaze-based Object Grouping with an AR System. Proceedings of the ACM on Human-Computer Interaction 8, ETRA, 1–19. https://doi.org/10.1145/3655599. Online publication date: 28-May-2024.
  • (2024) Leveraging Digital Trace Data to Investigate and Support Human-Centered Work Processes. In Evaluation of Novel Approaches to Software Engineering, 1–23. https://doi.org/10.1007/978-3-031-64182-4_1. Online publication date: 10-Jul-2024.
  • (2023) Evaluating the Usability of a Gaze-Adaptive Approach for Identifying and Comparing Raster Values between Multilayers. ISPRS International Journal of Geo-Information 12, 10 (412). https://doi.org/10.3390/ijgi12100412. Online publication date: 8-Oct-2023.
  • (2023) MR Object Identification and Interaction. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 3, 1–26. https://doi.org/10.1145/3610879. Online publication date: 27-Sep-2023.
  • (2022) In-Depth Review of Augmented Reality: Tracking Technologies, Development Tools, AR Displays, Collaborative AR, and Security Concerns. Sensors 23, 1 (146). https://doi.org/10.3390/s23010146. Online publication date: 23-Dec-2022.
  • (2022) Where and What. Proceedings of the ACM on Human-Computer Interaction 6, ETRA, 1–22. https://doi.org/10.1145/3530887. Online publication date: 13-May-2022.
  • (2021) Mobile Eye-Tracking Data Analysis Using Object Detection via YOLO v4. Sensors 21, 22 (7668). https://doi.org/10.3390/s21227668. Online publication date: 18-Nov-2021.
  • (2021) Investigating the Usability of a Head-Mounted Display Augmented Reality Device in Elementary School Children. Sensors 21, 19 (6623). https://doi.org/10.3390/s21196623. Online publication date: 5-Oct-2021.
