Abstract
Violent activity recognition is a relatively new and comparatively under-studied sub-problem of activity recognition. This study proposes a method that classifies violent activity videos using 3D Convolutional Neural Networks (CNNs) and transfer learning. Deep features are extracted from the frames of the input video with a pre-trained AlexNet model. The features of each frame are reshaped into a 2D structure, and these 2D structures are concatenated to build a 3D feature volume. Because the 3D CNN classifier can only process fixed-size inputs, the feature volumes are resized with 3D interpolation before they are used to train the 3D CNN model. The proposed model is tested on the Hockey Fight, Violent Flow, and Movies datasets and compared with prior studies. It achieves higher classification accuracy than temporal methods such as LSTM and Bi-LSTM.
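The volume-construction and resizing steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4096-dimensional feature length, the 64 × 64 reshape, the target size of 16 × 64 × 64, and all function names are hypothetical, and random vectors stand in for the AlexNet features. Trilinear interpolation is used here as one possible form of 3D interpolation; the abstract does not state which scheme was used.

```python
import numpy as np


def build_feature_volume(frame_features, side):
    """Reshape each per-frame feature vector to (side, side) and stack over time."""
    frame_features = np.asarray(frame_features, dtype=np.float64)
    n_frames, dim = frame_features.shape
    if dim != side * side:
        raise ValueError("feature length must equal side * side")
    return frame_features.reshape(n_frames, side, side)


def resize_axis_linear(volume, new_len, axis):
    """Linearly resample `volume` along one axis to `new_len` samples."""
    old_len = volume.shape[axis]
    if old_len == new_len:
        return volume.copy()
    if old_len == 1:
        return np.repeat(volume, new_len, axis=axis)
    # Sample positions in the old index space, with endpoints aligned.
    pos = np.linspace(0.0, old_len - 1, new_len)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, old_len - 1)
    # Interpolation weights, shaped to broadcast along `axis`.
    shape = [1] * volume.ndim
    shape[axis] = new_len
    w = (pos - lo).reshape(shape)
    return (1.0 - w) * np.take(volume, lo, axis=axis) + w * np.take(volume, hi, axis=axis)


def resize_volume_trilinear(volume, target_shape):
    """Trilinear resize, applied as three separable 1D linear interpolations."""
    out = np.asarray(volume, dtype=np.float64)
    for axis, new_len in enumerate(target_shape):
        out = resize_axis_linear(out, new_len, axis)
    return out


# Stand-in for deep features of a 37-frame video (assumed 4096 values per frame).
rng = np.random.default_rng(0)
features = rng.standard_normal((37, 4096))

volume = build_feature_volume(features, side=64)        # shape (37, 64, 64)
fixed = resize_volume_trilinear(volume, (16, 64, 64))   # fixed-size classifier input
print(fixed.shape)  # (16, 64, 64)
```

Videos with different frame counts yield volumes of different depth, so resizing every volume to the same shape is what lets a single fixed-input 3D CNN consume them.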





Cite this article
Keceli, A.S., Kaya, A. Violent activity classification with transferred deep features and 3d-Cnn. SIViP 17, 139–146 (2023). https://doi.org/10.1007/s11760-022-02213-3
DOI: https://doi.org/10.1007/s11760-022-02213-3