poster

IMETA: An Interactive Mobile Eye Tracking Annotation Method for Semi-automatic Fixation-to-AOI mapping

Authors:

László Kopácsi,

Omair Shahzad Bhatti,

Daniel Sonntag

IUI '23 Companion: Companion Proceedings of the 28th International Conference on Intelligent User Interfaces

Pages 33 - 36

https://doi.org/10.1145/3581754.3584125

Published: 27 March 2023

Abstract

Mobile eye tracking studies involve analyzing areas of interest (AOIs) and visual attention to these AOIs to understand how people process visual information. However, accurately annotating the data collected in user studies is a challenging and time-consuming task. Current approaches for automatically or semi-automatically analyzing head-mounted eye tracking data in mobile eye tracking studies have limitations, including a lack of annotation flexibility and an inability to adapt to specific target domains. To address this problem, we present IMETA, an architecture for semi-automatic fixation-to-AOI mapping. When an annotator assigns an AOI label to a sequence of frames based on the respective fixation points, an interactive video object segmentation method is used to estimate a mask proposal for the AOI. Then, we use the 3D reconstruction of the visual scene created from the eye tracking video to map these AOI masks to 3D. The resulting 3D segmentation of the AOI can be used to suggest labels for the rest of the video. Through interactive machine learning (IML), these suggestions become increasingly accurate as the annotator provides more samples. IMETA has the potential to reduce the annotation workload and speed up the evaluation of mobile eye tracking studies.
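The mask-to-3D and label-suggestion steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera, a known per-pixel depth, and a single shared coordinate frame (identity camera pose), whereas a real pipeline would take depth and poses from the 3D reconstruction. The names `Intrinsics`, `backproject`, `lift_mask`, and `suggest_label`, the nearest-neighbour voting rule, and all numeric values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera parameters in pixels (hypothetical values below)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u, v, depth, K):
    """Lift pixel (u, v) with a known depth to a 3D point in camera coordinates."""
    x = (u - K.cx) * depth / K.fx
    y = (v - K.cy) * depth / K.fy
    return (x, y, depth)


def lift_mask(mask_pixels, depth_map, K, label):
    """Turn the pixels of a 2D AOI mask into labelled 3D points.

    Assumption: identity camera pose, so camera and world coordinates coincide.
    """
    return [(backproject(u, v, depth_map[v][u], K), label) for (u, v) in mask_pixels]


def suggest_label(fixation, depth, K, labelled_points, radius=0.05):
    """Suggest an AOI label for a fixation by majority vote among the
    labelled 3D points that lie within `radius` of the lifted fixation."""
    p = backproject(fixation[0], fixation[1], depth, K)
    votes = Counter()
    for q, label in labelled_points:
        dist2 = sum((a - b) ** 2 for a, b in zip(p, q))
        if dist2 <= radius ** 2:
            votes[label] += 1
    if not votes:
        return None  # no labelled geometry nearby: leave the fixation unlabelled
    return votes.most_common(1)[0][0]


# Toy scene: a flat 5x5 depth map at 1 m and a 2x2 mask labelled "cup".
K = Intrinsics(fx=100.0, fy=100.0, cx=2.0, cy=2.0)
depth_map = [[1.0] * 5 for _ in range(5)]
points = lift_mask([(1, 1), (2, 1), (1, 2), (2, 2)], depth_map, K, "cup")

on_aoi = suggest_label((2, 2), 1.0, K, points)                 # "cup"
off_aoi = suggest_label((4, 4), 1.0, K, points, radius=0.005)  # None
```

In this sketch, each additional annotated mask simply adds more labelled points to vote with, which loosely mirrors how suggestions could improve as the annotator provides more samples.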


Cited By

Niehorster D, Nyström M, Hessels R, Andersson R, Benjamins J, Hansen D, Hooge I. (2025). The fundamentals of eye tracking part 4: Tools for conducting an eye tracking study. Behavior Research Methods 57:1. Online publication date: 6-Jan-2025.
https://doi.org/10.3758/s13428-024-02529-7

Index Terms

IMETA: An Interactive Mobile Eye Tracking Annotation Method for Semi-automatic Fixation-to-AOI mapping
1. Computing methodologies
  1. Machine learning
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Empirical studies in HCI
    2. Interactive systems and tools


Published In

IUI '23 Companion: Companion Proceedings of the 28th International Conference on Intelligent User Interfaces

March 2023

266 pages

ISBN:9798400701078

DOI:10.1145/3581754

Copyright © 2023 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster
Research
Refereed limited

Funding Sources

European Commission
German Federal Ministry of Education and Research

Conference

IUI '23

IUI '23: 28th International Conference on Intelligent User Interfaces

March 27 - 31, 2023

Sydney, NSW, Australia
