Human action recognition based on aggregated local motion estimates

Lucena, M.; Pérez de la Blanca, N.; Fuertes, J. M.

doi:10.1007/s00138-010-0305-9

Original Paper
Published: 09 December 2010

Volume 23, pages 135–150 (2012)
Machine Vision and Applications

M. Lucena¹,
N. Pérez de la Blanca² &
J. M. Fuertes¹

Abstract

This paper addresses human action recognition from optical flow. The task is an interesting problem in itself, given the inaccuracy and noise of optical flow estimates. Optical flow is one of the most popular descriptors for characterizing motion, but because of its instability it is usually used in combination with parametric models. In this work, we develop a non-parametric motion model that uses only the image region surrounding the actor performing the action. Specifically, for every pair of consecutive frames, a local motion descriptor is calculated from the optical flow orientation histograms collected inside the actor’s bounding box. An action descriptor is then built by weighting and aggregating the estimated histograms along the temporal axis. The proposed approach achieves a promising trade-off between complexity and performance compared with state-of-the-art approaches. Action recognition can also be performed in real time by accumulating evidence from each new incoming image. Experiments on two well-known video sequence databases are carried out to evaluate the behavior of the proposal.
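
The pipeline summarized in the abstract (per-frame-pair orientation histograms inside the actor's bounding box, then weighted aggregation over time) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the number of bins, the magnitude weighting, the magnitude threshold, and the final normalization are all assumptions chosen for illustration, and the flow fields are assumed to be precomputed as `(H, W, 2)` arrays of per-pixel `(dx, dy)` displacements.

```python
import numpy as np

def orientation_histogram(flow, box, n_bins=8, min_mag=1e-3):
    """Magnitude-weighted histogram of flow orientations inside a bounding box.

    flow: (H, W, 2) array of (dx, dy) displacements between two frames.
    box:  (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    n_bins and min_mag are illustrative choices, not values from the paper.
    """
    x0, y0, x1, y1 = box
    patch = flow[y0:y1, x0:x1]
    dx = patch[..., 0].ravel()
    dy = patch[..., 1].ravel()
    mag = np.hypot(dx, dy)
    ang = np.arctan2(dy, dx) % (2 * np.pi)  # orientation mapped to [0, 2*pi]
    keep = mag > min_mag                    # ignore near-static pixels
    hist, _ = np.histogram(ang[keep], bins=n_bins,
                           range=(0.0, 2 * np.pi), weights=mag[keep])
    total = hist.sum()
    return hist / total if total > 0 else hist

def action_descriptor(histograms, weights=None):
    """Weighted sum of per-frame histograms along time, normalized to sum to 1."""
    h = np.asarray(histograms, dtype=float)  # shape (T, n_bins)
    if weights is None:
        w = np.ones(len(h))
    else:
        w = np.asarray(weights, dtype=float)
    agg = (w[:, None] * h).sum(axis=0)
    total = agg.sum()
    return agg / total if total > 0 else agg

# Synthetic example: one flow field drifting right, one drifting left.
right = np.zeros((10, 10, 2)); right[..., 0] = 1.0; right[..., 1] = 0.5
left = np.zeros((10, 10, 2)); left[..., 0] = -1.0; left[..., 1] = 0.5
box = (2, 2, 8, 8)
h_right = orientation_histogram(right, box)
h_left = orientation_histogram(left, box)
descriptor = action_descriptor([h_right, h_left], weights=[3.0, 1.0])
```

Because the aggregate is a running weighted sum, it could in principle be updated one frame pair at a time, which is one way to read the abstract's remark about accumulating evidence from each incoming image. How the paper actually weights the histograms and classifies the resulting descriptor is not stated in the abstract.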

Author information

Authors and Affiliations

Department of Computer Science, University of Jaén, 23071, Jaén, Spain
M. Lucena & J. M. Fuertes
Department of Computer Science and Artificial Intelligence, University of Granada, 18071, Granada, Spain
N. Pérez de la Blanca

Corresponding author

Correspondence to M. Lucena.

Electronic Supplementary Material

Below is the Electronic Supplementary Material.

ESM 1 (PDF 21 kb)

About this article

Cite this article

Lucena, M., Pérez de la Blanca, N. & Fuertes, J.M. Human action recognition based on aggregated local motion estimates. Machine Vision and Applications 23, 135–150 (2012). https://doi.org/10.1007/s00138-010-0305-9

Received: 03 December 2009
Revised: 09 August 2010
Accepted: 15 November 2010
Published: 09 December 2010
Issue Date: January 2012
DOI: https://doi.org/10.1007/s00138-010-0305-9
