
Unethical human action recognition using deep learning based hybrid model for video forensics

Published in: Multimedia Tools and Applications

Abstract

With the rapid growth of multimedia collections worldwide, video forensics faces new obstacles in recognizing human actions in video surveillance systems, human-computer interaction, and similar settings, which require multi-activity recognition systems. Recognizing human activities from video sequences or still images is difficult due to background clutter, partial occlusion, scaling, viewpoint, lighting, and appearance variations. The literature offers a variety of deep learning methods for unethical human action recognition that are effective at learning low-level temporal and spatial features but struggle to learn high-level features, which limits a model's feature-learning capability and leads to poor performance. From a digital forensic perspective, deep analysis of video has become a prerequisite for human action recognition methods concerned with cyber-crime investigation and prevention. In this paper, we propose a deep learning based hybrid model for unethical human action recognition that combines a two-stream Inflated 3D ConvNet (I3D) with spatio-temporal attention (STA) modules. The I3D model improves on the 3D CNN architecture by inflating 2D convolution kernels into 3D kernels, while STA increases learning capability by attending to each frame's spatial and temporal information. To test our model, we built a multi-action dataset from subsets of diverse datasets (Weizmann, HMDB51, UCF-101, NPDI, and UCF-Crime) and compared the proposed model with existing models on both single-action and multi-action datasets, demonstrating better performance.
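The kernel-inflation step the abstract refers to (bootstrapping 3D filters from pretrained 2D ones, as introduced with I3D) can be sketched in NumPy. This is an illustrative sketch of the general technique, not the authors' implementation; the function name and array conventions are assumptions:

```python
import numpy as np

def inflate_kernel(k2d: np.ndarray, time_depth: int) -> np.ndarray:
    """Inflate a 2D convolution kernel of shape (H, W) into a 3D kernel
    of shape (T, H, W) by repeating it along the temporal axis and
    rescaling by 1/T. A static video (the same frame repeated T times)
    then produces the same activation the 2D filter gave on one frame."""
    k3d = np.repeat(k2d[np.newaxis, :, :], time_depth, axis=0)
    return k3d / time_depth
```

Because of the 1/T rescaling, summing the inflated kernel over its temporal axis recovers the original 2D weights, which is what preserves the pretrained 2D network's responses on temporally constant input.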

Materials Availability

The authors confirm that the data supporting the findings of this study are available within the article.

Code Availability

The code is held by the authors and will be provided to the research community on request and mutual agreement.


Funding

No funding was received for conducting this study.

Author information

Contributions

All the authors contributed to the study’s conception and design.

Corresponding author

Correspondence to Raghavendra Gowada.

Ethics declarations

Conflict of Interests

The authors have no conflicts of interest or competing interests to declare.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gowada, R., Pawar, D. & Barman, B. Unethical human action recognition using deep learning based hybrid model for video forensics. Multimed Tools Appl 82, 28713–28738 (2023). https://doi.org/10.1007/s11042-023-14508-9
