Abstract
Interest point detection in still images is a well-studied topic in computer vision. In the spatiotemporal domain, however, it is still unclear which features indicate useful interest points. In this paper we approach the problem by learning a detector from examples: we record eye movements of human subjects watching video sequences and train a neural network to predict which locations are likely to become eye movement targets. We show that our detector outperforms current spatiotemporal interest point architectures on a standard classification dataset.
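The learning setup described above can be illustrated with a minimal sketch. It is not the authors' network, features, or data: the contrast feature, the single-layer logistic model, the 3×3×3 patch size, and the synthetic "fixated" and "control" patches are all assumptions chosen only to show the idea of scoring spatiotemporal locations from labelled examples.

```python
import math
import random


def patch_features(patch):
    """Flatten a cube indexed [t][y][x] into absolute deviations from its mean.

    Hypothetical feature choice: it lets a linear model respond to local contrast.
    """
    values = [v for frame in patch for row in frame for v in row]
    mean = sum(values) / len(values)
    return [abs(v - mean) for v in values]


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=100):
    """Fit a logistic model on mean-centred features by batch gradient descent."""
    n, d = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - mu[j] for j in range(d)] for row in X]
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(Xc, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return mu, w, b


def score(features, mu, w, b):
    """Predicted probability that a location is an eye movement target."""
    return sigmoid(sum(wj * (fj - mj) for wj, fj, mj in zip(w, features, mu)) + b)


def random_patch(rng, std, size=3):
    """Synthetic stand-in for a video cube; real data would come from recordings."""
    return [[[rng.gauss(0.0, std) for _ in range(size)]
             for _ in range(size)] for _ in range(size)]


# Assumption: "fixated" patches have higher contrast than "control" patches.
rng = random.Random(0)
patches = ([random_patch(rng, 1.0) for _ in range(100)]
           + [random_patch(rng, 0.2) for _ in range(100)])
labels = [1.0] * 100 + [0.0] * 100

X = [patch_features(p) for p in patches]
mu, w, b = fit_logistic(X, labels)
scores = [score(x, mu, w, b) for x in X]
accuracy = sum((s > 0.5) == (l == 1.0) for s, l in zip(scores, labels)) / len(labels)
print(f"training accuracy on synthetic patches: {accuracy:.2f}")
```

In a detector of this kind, the trained scoring function would be evaluated at every location of a video, and the highest-scoring locations would be taken as interest points.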
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kienzle, W., Schölkopf, B., Wichmann, F.A., Franz, M.O. (2007). How to Find Interesting Locations in Video: A Spatiotemporal Interest Point Detector Learned from Human Eye Movements. In: Hamprecht, F.A., Schnörr, C., Jähne, B. (eds) Pattern Recognition. DAGM 2007. Lecture Notes in Computer Science, vol 4713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74936-3_41
DOI: https://doi.org/10.1007/978-3-540-74936-3_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74933-2
Online ISBN: 978-3-540-74936-3
eBook Packages: Computer Science, Computer Science (R0)