ABSTRACT
An ideal Mixed Reality (MR) system would present virtual information (e.g., a label) only when it is useful to the user. However, deciding when a label is useful is challenging: it depends on many factors, including the current task, the user's prior knowledge, and the context. In this paper, we propose a Reinforcement Learning (RL) method that learns when to show or hide an object's label given eye movement data. We demonstrate the capabilities of this approach by showing that an intelligent agent can learn cooperative policies that support users in a visual search task better than manually designed heuristics do. Furthermore, we show that our approach applies to more realistic environments and use cases (e.g., grocery shopping). By posing MR object labeling as a model-free RL problem, we can learn policies implicitly by observing users' behavior, without requiring a visual search model or data annotation.
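To make the core idea concrete, the following is a minimal, hypothetical sketch of model-free RL for label show/hide decisions driven by gaze. The state discretization, the dwell-time feature, the simulated reward signal, and the one-step (bandit-style) update are all illustrative assumptions, not the authors' actual formulation, which learns from real users' eye movement data over sequential episodes.

```python
import random

# Hypothetical sketch: a model-free RL agent that learns when to show
# or hide an object's label from gaze behavior. The environment below
# simulates gaze dwell times; it is a stand-in, not the paper's setup.

ACTIONS = ["hide", "show"]

def discretize_dwell(dwell_time, bins=(0.1, 0.3, 0.6)):
    """Map a continuous gaze dwell time (in seconds) to a small state id."""
    for i, edge in enumerate(bins):
        if dwell_time < edge:
            return i
    return len(bins)

def train(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value
    for _ in range(episodes):
        dwell = rng.random()  # simulated dwell time on an object
        state = discretize_dwell(dwell)
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        # Hypothetical reward: showing the label is helpful only when the
        # user has fixated the object long enough to plausibly need it.
        wants_label = dwell >= 0.3
        reward = 1.0 if (action == "show") == wants_label else -1.0
        # One-step value update; the paper's sequential-episode setting
        # is collapsed to a contextual-bandit update for brevity.
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward - old)
    return q

def policy(q, dwell):
    """Greedy show/hide decision for a given dwell time."""
    s = discretize_dwell(dwell)
    return max(ACTIONS, key=lambda a: q.get((s, a), 0.0))
```

The point of the sketch is that no visual search model or annotated data is needed: the agent only observes behavior (here, dwell time) and a reward, and implicitly learns a personalized show/hide policy.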
Index Terms
- Learning Cooperative Personalized Policies from Gaze Data