DOI: 10.1145/2522848.2522853

Predicting where we look from spatiotemporal gaps

Published: 09 December 2013

Abstract

When we watch videos, spatiotemporal gaps arise between where we look and what we focus on; they result from temporally delayed responses and anticipation in eye movements. We focus on the underlying structure of these gaps and propose a novel method to predict points of gaze from video data. In the proposed method, we model the spatiotemporal patterns of salient regions that tend to attract focus, and statistically learn which types of patterns appear strongly around the points of gaze for each type of eye movement. This allows us to exploit the gap structures induced by eye movements and salient motion for gaze-point prediction. The effectiveness of the proposed method is confirmed on several public datasets.
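The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch, not the paper's implementation. It assumes per-frame saliency maps are precomputed by some bottom-up model, crops spatiotemporal patches around candidate gaze points, and scores them against per-eye-movement-type templates learned from training gaze data. All names (`extract_patch`, `learn_templates`, `predict_gaze`, `radius`, `tau`) are hypothetical, and the simple mean-template matching stands in for the paper's richer statistical learning.

```python
import numpy as np

def extract_patch(saliency, t, x, y, radius=16, tau=8):
    """Crop a spatiotemporal saliency patch around (t, x, y) and flatten it.
    `saliency` is a (T, H, W) array of per-frame saliency maps, assumed
    to be precomputed by any bottom-up saliency model."""
    T, H, W = saliency.shape
    t0, t1 = max(t - tau, 0), min(t + 1, T)
    y0, y1 = max(y - radius, 0), min(y + radius, H)
    x0, x1 = max(x - radius, 0), min(x + radius, W)
    patch = saliency[t0:t1, y0:y1, x0:x1]
    # Zero-pad to a fixed shape so all patch vectors are comparable.
    out = np.zeros((tau + 1, 2 * radius, 2 * radius))
    out[:patch.shape[0], :patch.shape[1], :patch.shape[2]] = patch
    return out.ravel()

def learn_templates(patches_by_movement):
    """Average training patches per eye-movement type (e.g. fixation,
    saccade, smooth pursuit) into one template each; a mean template is
    the simplest stand-in for statistically learned patterns."""
    return {m: np.mean(np.stack(ps), axis=0)
            for m, ps in patches_by_movement.items()}

def predict_gaze(saliency, t, candidates, templates):
    """Return the candidate point (and movement type) whose local
    spatiotemporal saliency pattern best matches a template,
    using cosine similarity."""
    best = (-np.inf, None, None)
    for (x, y) in candidates:
        v = extract_patch(saliency, t, x, y)
        for movement, tmpl in templates.items():
            s = v @ tmpl / (np.linalg.norm(v) * np.linalg.norm(tmpl) + 1e-9)
            if s > best[0]:
                best = (s, (x, y), movement)
    return best
```

In practice, the candidate points might come from local maxima of the current saliency map, and the eye-movement labels for the training patches from a velocity-based classifier of the recorded gaze; both steps are outside this sketch.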



Published In

ICMI '13: Proceedings of the 15th ACM International Conference on Multimodal Interaction
December 2013, 630 pages
ISBN: 9781450321297
DOI: 10.1145/2522848

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. eye movement
  2. saliency map
  3. spatiotemporal gap

Qualifiers

  • Research-article

Conference

ICMI '13

Acceptance Rates

ICMI '13 Paper Acceptance Rate: 49 of 133 submissions, 37%
Overall Acceptance Rate: 453 of 1,080 submissions, 42%

