Abstract
Current state-of-the-art approaches to visual human action recognition focus on complex local spatio-temporal descriptors and discard the spatio-temporal relations between them. These bag-of-features (BOF) methods therefore have limited descriptive power: class-specific mid- and large-scale spatio-temporal information, such as body pose sequences, cannot be represented. To overcome this restriction, we propose sparse non-negative linear dynamical systems (sNN-LDS) as a dynamic, parts-based, spatio-temporal representation of local descriptors. We provide novel learning rules based on sparse non-negative matrix factorization (sNMF) that simultaneously learn both the parts and their transitions. On the challenging UCF-Sports dataset, our sNN-LDS combined with simple local features is competitive with state-of-the-art BOF-SVM methods.
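To make the sNMF building block concrete, the following is a minimal sketch of generic sparse non-negative matrix factorization with multiplicative updates. It is not the learning rule proposed in the paper, and it does not learn transitions between parts. The function name `sparse_nmf`, its parameters, and the choice of an L1 penalty on the activations with unit-norm parts are all assumptions made for illustration.

```python
import numpy as np


def sparse_nmf(V, n_parts, sparsity=0.1, n_iter=200, eps=1e-9, seed=0):
    """Illustrative sparse NMF (hypothetical helper, not the paper's rule).

    Approximates a non-negative matrix V (features x samples) as W @ H,
    where the columns of W are non-negative "parts" and H holds
    non-negative activations that are pushed toward sparsity by an
    L1 penalty of weight `sparsity`.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, n_parts)) + eps
    H = rng.random((n_parts, n_samples)) + eps

    for _ in range(n_iter):
        # Multiplicative update of the activations; the added `sparsity`
        # term in the denominator shrinks small activations toward zero.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)

        # Multiplicative update of the parts (plain squared-error NMF step).
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)

        # Rescale each part to unit length and compensate in H, so the
        # product W @ H is unchanged and the penalty cannot be evaded
        # by inflating W.
        norms = np.maximum(np.linalg.norm(W, axis=0), eps)
        W /= norms
        H *= norms[:, None]

    return W, H
```

Calling `sparse_nmf(V, n_parts=3)` on a non-negative descriptor matrix `V` returns the parts `W` and their activations `H`. Because the updates are multiplicative, entries that start non-negative stay non-negative, which is what gives the factorization its parts-based character. A dynamic model in the spirit of the abstract would additionally need a term that couples the activations of consecutive time steps, which this sketch omits.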
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Guthier, T., Šošić, A., Willert, V., Eggert, J. (2014). sNN-LDS: Spatio-temporal Non-negative Sparse Coding for Human Action Recognition. In: Wermter, S., et al. Artificial Neural Networks and Machine Learning – ICANN 2014. ICANN 2014. Lecture Notes in Computer Science, vol 8681. Springer, Cham. https://doi.org/10.1007/978-3-319-11179-7_24
DOI: https://doi.org/10.1007/978-3-319-11179-7_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11178-0
Online ISBN: 978-3-319-11179-7
eBook Packages: Computer Science, Computer Science (R0)