How to Find Interesting Locations in Video: A Spatiotemporal Interest Point Detector Learned from Human Eye Movements

Kienzle, Wolf; Schölkopf, Bernhard; Wichmann, Felix A.; Franz, Matthias O.

doi:10.1007/978-3-540-74936-3_41

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4713)

Included in the following conference series:

Joint Pattern Recognition Symposium

Abstract

Interest point detection in still images is a well-studied topic in computer vision. In the spatiotemporal domain, however, it is still unclear which features indicate useful interest points. In this paper we approach the problem by learning a detector from examples: we record eye movements of human subjects watching video sequences and train a neural network to predict which locations are likely to become eye movement targets. We show that our detector outperforms current spatiotemporal interest point architectures on a standard classification dataset.
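
The learning setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The data are synthetic, the rule that "fixated" patches brighten over time is an assumption made only so the toy problem is learnable, and every name (`make_patch`, `make_dataset`, `train_detector`, `accuracy`) is hypothetical. The sketch trains a small one-hidden-layer network to separate patches labeled as eye movement targets from control patches.

```python
import math
import random


def make_patch(rng, fixated):
    """Toy 2-frame, 3x3-pixel patch flattened to 18 values (synthetic, not from the paper)."""
    frame1 = [rng.random() for _ in range(9)]
    shift = 1.0 if fixated else 0.0  # assumption: fixated patches brighten over time
    frame2 = [v + shift + rng.gauss(0.0, 0.1) for v in frame1]
    return frame1 + frame2


def make_dataset(rng, n):
    """Return n (patch, label) pairs; label 1 stands for an eye movement target."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((make_patch(rng, label == 1), label))
    return data


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return hidden activations and the predicted target probability for patch x."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    score = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, score


def train_detector(data, n_hidden=4, epochs=40, lr=0.02, seed=0):
    """Train a one-hidden-layer network with plain SGD on cross-entropy loss."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x, y in order:
            hidden, score = forward((w1, b1, w2, b2), x)
            d_out = score - y  # loss gradient w.r.t. the output pre-activation
            for j in range(n_hidden):
                # Backpropagate through tanh before updating w2[j].
                d_hid = d_out * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * d_out * hidden[j]
                for i in range(n_in):
                    w1[j][i] -= lr * d_hid * x[i]
                b1[j] -= lr * d_hid
            b2 -= lr * d_out
    return w1, b1, w2, b2


def accuracy(params, data):
    """Fraction of patches whose thresholded score matches the label."""
    hits = sum((forward(params, x)[1] > 0.5) == (y == 1) for x, y in data)
    return hits / len(data)


rng = random.Random(0)
train_set, test_set = make_dataset(rng, 200), make_dataset(rng, 100)
params = train_detector(train_set)
print("held-out accuracy:", accuracy(params, test_set))
```

In a real setting the positive examples would be video patches centered on recorded fixations and the negatives patches sampled elsewhere. The trained score would then be evaluated densely over a video, with its local maxima taken as interest points.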

Author information

Authors and Affiliations

Max-Planck Institut für biologische Kybernetik, Abteilung Empirische Inferenz, Spemannstr. 38, 72076 Tübingen
Wolf Kienzle, Bernhard Schölkopf & Matthias O. Franz
Technische Universität Berlin, Fakultät IV, FB Modellierung Kognitiver Prozesse, Sekr. FR 6-4, Franklinstr. 28/29, 10587 Berlin
Felix A. Wichmann
Bernstein Center for Computational Neuroscience, Philippstr. 13 Haus 6, 10115 Berlin
Felix A. Wichmann

Editor information

Fred A. Hamprecht, Christoph Schnörr, Bernd Jähne

About this paper

Cite this paper

Kienzle, W., Schölkopf, B., Wichmann, F.A., Franz, M.O. (2007). How to Find Interesting Locations in Video: A Spatiotemporal Interest Point Detector Learned from Human Eye Movements. In: Hamprecht, F.A., Schnörr, C., Jähne, B. (eds) Pattern Recognition. DAGM 2007. Lecture Notes in Computer Science, vol 4713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74936-3_41

DOI: https://doi.org/10.1007/978-3-540-74936-3_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74933-2
Online ISBN: 978-3-540-74936-3
eBook Packages: Computer Science, Computer Science (R0)
