Abstract
In this paper, we present a new method for video action recognition. Our contributions are two-fold. First, we propose local coordinates contained descriptors (LCCD) in place of appearance-only descriptors. LCCD combine each descriptor with its spatio-temporal location, so that global geometric correspondence is encoded as part of the coding step itself, unlike previous methods such as spatio-temporal pyramid matching (STPM). Second, we develop a novel non-negative low rank and sparse coding model to encode descriptors for action recognition. Motivated by low rank matrix recovery and completion, we observe that local descriptors in a spatio-temporal neighborhood are similar and should therefore be approximately low rank. The objective function seeks non-negative, low rank, and sparse coefficients for the local descriptors. The learned coefficients capture location information and the structure of the descriptors, and hence improve the discriminability of the representations. Experiments show that our method achieves state-of-the-art results on two benchmark datasets.
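The two ideas in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact LCCD construction or the objective function, so everything below is an assumption made for illustration: `build_lccd` appends normalized (x, y, t) coordinates to each descriptor with a hypothetical `weight`, and `nn_low_rank_sparse_code` targets an assumed objective, 0.5·||X − BZ||_F² + λ₁||Z||_* + λ₂||Z||₁ with Z ≥ 0. The solver is a simple heuristic (a gradient step followed by singular value thresholding and a non-negative soft threshold applied in sequence), not the authors' optimization procedure.

```python
import numpy as np


def build_lccd(descriptors, locations, video_shape, weight=1.0):
    """Append normalized (x, y, t) coordinates to each appearance descriptor.

    descriptors: (n, d) array of local appearance descriptors.
    locations:   (n, 3) array of (x, y, t) positions.
    video_shape: (width, height, num_frames), used to scale locations to [0, 1].
    weight:      hypothetical trade-off between appearance and location.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    coords = np.asarray(locations, dtype=float) / np.asarray(video_shape, dtype=float)
    return np.hstack([descriptors, weight * coords])


def svt(Z, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def nn_low_rank_sparse_code(X, B, lam_rank=0.1, lam_sparse=0.1, n_iter=200):
    """Heuristic for the ASSUMED objective
        0.5 * ||X - B Z||_F^2 + lam_rank * ||Z||_* + lam_sparse * ||Z||_1,  Z >= 0.

    X: (d, n) descriptors as columns; B: (d, k) codebook; returns Z of shape (k, n).
    """
    step = 1.0 / (np.linalg.norm(B, 2) ** 2)  # inverse Lipschitz constant of the gradient
    Z = np.zeros((B.shape[1], X.shape[1]))
    for _ in range(n_iter):
        Z = Z - step * (B.T @ (B @ Z - X))          # gradient step on the data term
        Z = svt(Z, step * lam_rank)                 # shrink singular values (low rank)
        Z = np.maximum(Z - step * lam_sparse, 0.0)  # non-negative soft threshold (sparse)
    return Z
```

In this sketch the descriptors from one spatio-temporal neighborhood would be stacked as the columns of `X`, so that the low rank term acts on the coefficients of neighboring descriptors jointly. Because the last step of each iteration clips at zero, the returned coefficients are always non-negative.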
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sheng, B., Yang, W., Zhang, B., Sun, C. (2014). A Non-negative Low Rank and Sparse Model for Action Recognition. In: Li, S., Liu, C., Wang, Y. (eds) Pattern Recognition. CCPR 2014. Communications in Computer and Information Science, vol 484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45643-9_28
DOI: https://doi.org/10.1007/978-3-662-45643-9_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45642-2
Online ISBN: 978-3-662-45643-9
eBook Packages: Computer Science, Computer Science (R0)