Abstract
In this paper, we study mathematical models of atomic visual patterns in natural videos and establish a generative visual vocabulary for video representation. Empirically, we use small video patches (e.g., 15×15×5, called video “bricks”) from natural videos as the basic unit of analysis. The high-dimensional brick space contains a variety of brick subspaces (or atomic video words) of varying dimensions. The structure of each word is characterized by both appearance and motion dynamics. We categorize the words into two pure types: structural video words (SVWs) and textural video words (TVWs). A common generative model is introduced to represent both types of video words in a unified form. The representational power of a word is measured by its information gain, based on which words are pursued one by one via a novel pursuit algorithm, yielding a holistic video vocabulary. Experimental results show the potential of our framework for video representation.
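To make the notion of a video brick concrete, the following is a minimal sketch of cutting a grayscale video into 15×15×5 bricks and flattening each into a vector, that is, a point in the brick space. It is an illustration only, not the authors' implementation: the function name `extract_bricks`, the `video[t][y][x]` layout, the non-overlapping stride, and the reading of 15×15×5 as height × width × frames are all assumptions.

```python
from itertools import product


def extract_bricks(video, size=(15, 15, 5), stride=(15, 15, 5)):
    """Cut a video into bricks and flatten each brick into a vector.

    video  : nested list indexed as video[t][y][x] (assumed layout).
    size   : (height, width, frames) of one brick (assumed ordering).
    stride : step between brick origins along (y, x, t).
    """
    n_frames, n_rows, n_cols = len(video), len(video[0]), len(video[0][0])
    h, w, d = size
    sy, sx, st = stride
    bricks = []
    # Enumerate every brick origin that keeps the brick inside the video.
    for t0, y0, x0 in product(range(0, n_frames - d + 1, st),
                              range(0, n_rows - h + 1, sy),
                              range(0, n_cols - w + 1, sx)):
        # Flatten the brick in (t, y, x) order into one vector.
        brick = [video[t][y][x]
                 for t in range(t0, t0 + d)
                 for y in range(y0, y0 + h)
                 for x in range(x0, x0 + w)]
        bricks.append(brick)
    return bricks


# Synthetic toy video: 10 frames of 30 rows by 45 columns.
video = [[[(t + y + x) % 256 for x in range(45)]
          for y in range(30)]
         for t in range(10)]

bricks = extract_bricks(video)
print(len(bricks), len(bricks[0]))  # prints "12 1125"
```

Each brick here is a vector of 15 · 15 · 5 = 1125 intensities, so the subspaces mentioned in the abstract would live in a 1125-dimensional space under these assumptions.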
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, Y., Gong, H., Jia, Y. (2011). Pursuing Atomic Video Words by Information Projection. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6493. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19309-5_20
DOI: https://doi.org/10.1007/978-3-642-19309-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19308-8
Online ISBN: 978-3-642-19309-5
eBook Packages: Computer Science, Computer Science (R0)