Abstract
Human action recognition concerns systems that detect and identify the activities a person performs. Human activities are commonly classified into four categories, depending on the complexity of the steps and the number of body parts involved in the action: gestures, actions, interactions, and activities [1]. Capturing useful and discriminative features for video-based human action recognition is challenging because of variations in the human body. To obtain intelligent solutions for action recognition, it is necessary to train models to recognize which action a person performs. This paper reports an experiment on human action recognition that compares several deep learning models trained on a small dataset. The main goal is to obtain the same or better results than the literature, which applies larger datasets and requires high-performance hardware. Our analysis provides a roadmap covering the training, classification, and validation of each model.
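The training, classification, and validation stages mentioned above can be illustrated schematically. The sketch below is not the paper's method: it uses synthetic "clip feature" vectors and a nearest-centroid classifier as a lightweight stand-in for a deep model, purely to show the pipeline shape (split, fit, predict, score); all names, sizes, and the random seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for per-clip feature vectors: 4 action classes,
# 30 clips each, 64-dimensional features (sizes are illustrative).
n_classes, n_per_class, dim = 4, 30, 64
centers = rng.normal(size=(n_classes, dim)) * 3.0
X = np.concatenate([c + rng.normal(size=(n_per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# Train/validation split (80/20), as in a typical recognition pipeline.
idx = rng.permutation(len(y))
split = int(0.8 * len(y))
train, val = idx[:split], idx[split:]

# "Training": fit one centroid per class (a toy substitute for a deep model).
centroids = np.stack([X[train][y[train] == k].mean(axis=0)
                      for k in range(n_classes)])

# Classification + validation: assign each held-out clip to the closest
# centroid and measure accuracy on the validation split.
dists = np.linalg.norm(X[val][:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)
accuracy = (pred == y[val]).mean()
print(f"validation accuracy: {accuracy:.2f}")
```

In the actual experiments a deep network would replace the centroid step, but the surrounding split/evaluate scaffolding is the same.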
References
Ko, T.: A survey on behavior analysis in video surveillance for homeland security applications. In: 37th IEEE Applied Imagery Pattern Recognition Workshop, pp. 1–8. IEEE (2008)
Analide, C., Novais, P., Machado, J., Neves, J.: Quality of knowledge in virtual entities. In: Encyclopedia of Communities of Practice in Information and Knowledge Management, pp. 436–442. IGI Global (2006)
Durães, D., Marcondes, F.S., Gonçalves, F., Fonseca, J., Machado, J., Novais, P.: Detection violent behaviors: a survey. In: International Symposium on Ambient Intelligence, pp. 106–116. Springer, Cham (2020)
Marcondes, F.S., Durães, D., Gonçalves, F., Fonseca, J., Machado, J., Novais, P.: In-vehicle violence detection in carpooling: a brief survey towards a general surveillance system. In: International Symposium on Distributed Computing and Artificial Intelligence, pp. 211–220. Springer, Cham (2020)
Durães, D., Carneiro, D., Jiménez, A., Novais, P.: Characterizing attentive behavior in intelligent environments. Neurocomputing 272, 46–54 (2018)
Costa, R., Neves, J., Novais, P., Machado, J., Lima, L., Alberto, C.: Intelligent mixed reality for the creation of ambient assisted living. In: Portuguese Conference on Artificial Intelligence, pp. 323–331. Springer, Heidelberg (2007)
Zhu, Y., Zhao, X., Fu, Y., Liu, Y.: Sparse coding on local spatial-temporal volumes for human action recognition. In: Asian Conference on Computer Vision, pp. 660–671. Springer, Heidelberg (2010)
Jesus, T., Duarte, J., Ferreira, D., Durães, D., Marcondes, F., Santos, F., Machado, J.: Review of trends in automatic human activity recognition using synthetic audio-visual data. In: International Conference on Intelligent Data Engineering and Automated Learning, pp. 549–560. Springer, Cham (2020)
Shokri, M., Harati, A., Taba, K.: Salient object detection in video using deep non-local neural networks. J. Vis. Commun. Image Represent. 68, 102769 (2020)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J.: Gradient flow in recurrent nets: the difficulty of learning long-term dependencies (2001)
Huang, G., Sun, Y., Liu, Z., Sedra, D., Weinberger, K.Q.: Deep networks with stochastic depth. In: European Conference on Computer Vision, pp. 646–661. Springer, Cham (2016)
Feichtenhofer, C., Fan, H., Malik, J., He, K.: Slowfast networks for video recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 6202–6211 (2019)
Carreira, J., Zisserman, A.: Quo vadis, action recognition? A new model and the kinetics dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6299–6308 (2017)
Carreira, J., Noland, E., Hillier, C., Zisserman, A.: A short note on the Kinetics-700 human action dataset. arXiv preprint arXiv:1907.06987 (2019)
Li, A., Thotakuri, M., Ross, D.A., Carreira, J., Vostrikov, A., Zisserman, A.: The AVA-kinetics localized human actions video dataset. arXiv preprint arXiv:2005.00214 (2020)
Acknowledgement
This work is supported by: European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project no. 039334; Funding Reference: POCI-01-0247-FEDER-039334].
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Santos, F. et al. (2021). Modelling a Deep Learning Framework for Recognition of Human Actions on Video. In: Rocha, Á., Adeli, H., Dzemyda, G., Moreira, F., Ramalho Correia, A.M. (eds) Trends and Applications in Information Systems and Technologies. WorldCIST 2021. Advances in Intelligent Systems and Computing, vol 1365. Springer, Cham. https://doi.org/10.1007/978-3-030-72657-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72656-0
Online ISBN: 978-3-030-72657-7
eBook Packages: Intelligent Technologies and Robotics (R0)