
SV-NET: A Deep Learning Approach to Video Based Human Activity Recognition

  • Conference paper
Published in: Proceedings of the 11th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1182)

Abstract

The automatic identification of physical activities performed by human beings is referred to as Human Activity Recognition (HAR). It aims to infer the actions of one or more persons from a set of observations captured by sensors, videos or still images. Recognizing human activities from video sequences is a particularly challenging task due to problems such as background clutter, partial occlusion, and changes in scale, viewpoint, lighting, and appearance. In this paper, we propose a Convolutional Neural Network (CNN) model, named SV-NET, to classify human activities directly from RGB videos. The proposed model has been tested on three benchmark video datasets, namely KTH, UCF11 and HMDB51, and demonstrates improved performance over some existing deep learning based models.
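The abstract does not detail SV-NET's architecture, but classifying activities directly from RGB video typically starts by sampling a fixed number of frames per clip so that variable-length videos (as in KTH or HMDB51) can be stacked into uniform batches for a CNN. A minimal NumPy sketch of that preprocessing step (function names and the 16-frame choice are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def sample_frames(video, num_frames=16):
    """Uniformly sample num_frames frames from a clip of shape (T, H, W, 3)."""
    t = video.shape[0]
    idx = np.linspace(0, t - 1, num_frames).round().astype(int)
    return video[idx]

def make_batch(videos, num_frames=16):
    """Stack variable-length clips into a (N, num_frames, H, W, 3) batch."""
    return np.stack([sample_frames(v, num_frames) for v in videos])

# Two clips of different lengths at KTH-like 120x160 resolution
clips = [np.zeros((40, 120, 160, 3), dtype=np.uint8),
         np.zeros((95, 120, 160, 3), dtype=np.uint8)]
batch = make_batch(clips)
print(batch.shape)  # (2, 16, 120, 160, 3)
```

The resulting batch can then be fed to a 2D CNN frame-by-frame or to a 3D/spatio-temporal network; uniform sampling keeps the first and last frames so the whole action span is covered.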



Author information

Corresponding author

Correspondence to Pawan Kumar Singh.


Ethics declarations

Conflict of Interest.

The authors declare that there is no conflict of interest regarding the publication of this paper.


Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Bhattacharya, S., Shaw, V., Singh, P.K., Sarkar, R., Bhattacharjee, D. (2021). SV-NET: A Deep Learning Approach to Video Based Human Activity Recognition. In: Abraham, A., Jabbar, M., Tiwari, S., Jesus, I. (eds) Proceedings of the 11th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2019). SoCPaR 2019. Advances in Intelligent Systems and Computing, vol 1182. Springer, Cham. https://doi.org/10.1007/978-3-030-49345-5_2

