Efficient Framework for Action Recognition Using Reduced Fisher Vector Encoding

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 460))

Abstract

This paper presents a novel and efficient approach to recognizing human actions in video that improves performance through an unorthodox combination of stage-level approaches. Feature descriptors computed along dense trajectories, namely HOG, HOF and MBH, are known to represent videos successfully. In this work, a reduced-dimension Fisher Vector encoding is obtained separately for each of these descriptors, and the encodings are concatenated to form one super vector representing each video. To limit the dimension of this super vector, only the first-order statistics, computed by the Gaussian Mixture Model, are included in the individual Fisher Vectors. Finally, the elements of this super vector are fed as inputs to a Deep Belief Network (DBN) classifier. The performance of this setup is evaluated on the KTH and Weizmann datasets. Experimental results show a significant improvement on these datasets: accuracies of 98.92 % and 100 % are obtained on KTH and Weizmann, respectively.
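The encoding step described in the abstract can be illustrated with a minimal sketch. It uses the standard first-order (mean-gradient) Fisher Vector formula for a diagonal-covariance GMM and is not taken from the paper itself: the function names, the toy descriptor dimensions, the number of Gaussians and the randomly generated GMM parameters are all assumptions made for illustration, and the normalization and DBN classifier stages are omitted. Keeping only first-order statistics gives K*D values per descriptor instead of the 2*K*D of a Fisher Vector that also includes second-order terms.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Soft assignment of each local descriptor to each GMM component, shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]                      # (N, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / variances[None], axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1)[None, :])
    log_post = log_prob + np.log(weights)[None, :]
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    gamma = np.exp(log_post)
    return gamma / gamma.sum(axis=1, keepdims=True)

def first_order_fisher_vector(X, weights, means, variances):
    """First-order Fisher Vector (gradient w.r.t. the GMM means only), length K*D."""
    n_samples = X.shape[0]
    gamma = posteriors(X, weights, means, variances)              # (N, K)
    diff = (X[:, None, :] - means[None]) / np.sqrt(variances)[None]
    fv = np.einsum("nk,nkd->kd", gamma, diff)
    fv /= n_samples * np.sqrt(weights)[:, None]
    return fv.ravel()

def video_super_vector(descriptors, gmms, order=("HOG", "HOF", "MBH")):
    """Concatenate the per-descriptor Fisher Vectors into one vector per video."""
    return np.concatenate(
        [first_order_fisher_vector(descriptors[name], *gmms[name]) for name in order]
    )

# Toy usage with made-up sizes (assumptions, not the paper's settings).
rng = np.random.default_rng(0)
K = 4                                          # hypothetical number of Gaussians
dims = {"HOG": 8, "HOF": 9, "MBH": 16}         # hypothetical descriptor sizes
descriptors = {name: rng.normal(size=(50, d)) for name, d in dims.items()}
gmms = {
    name: (np.full(K, 1.0 / K),                # mixture weights
           rng.normal(size=(K, d)),            # component means
           np.ones((K, d)))                    # diagonal variances
    for name, d in dims.items()
}
super_vec = video_super_vector(descriptors, gmms)
print(super_vec.shape)
```

In a real pipeline the GMM parameters would be fitted on training descriptors rather than drawn at random, and the resulting super vector would be passed to the classifier.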

Author information

Corresponding author

Correspondence to Prithviraj Dhar.

Copyright information

© 2017 Springer Science+Business Media Singapore

About this paper

Cite this paper

Dhar, P., Alvarez, J.M., Roy, P.P. (2017). Efficient Framework for Action Recognition Using Reduced Fisher Vector Encoding. In: Raman, B., Kumar, S., Roy, P., Sen, D. (eds) Proceedings of International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 460. Springer, Singapore. https://doi.org/10.1007/978-981-10-2107-7_31

  • DOI: https://doi.org/10.1007/978-981-10-2107-7_31

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-2106-0

  • Online ISBN: 978-981-10-2107-7

  • eBook Packages: Engineering, Engineering (R0)
