Violence detection in videos for an intelligent surveillance system using MoBSIFT and movement filtering algorithm

  • Theoretical advances
Pattern Analysis and Applications

Abstract

Action recognition is an active research area in computer vision with a wide range of applications. Among these, recognizing violent actions is of particular importance because it bears directly on safety and security. An intelligent surveillance system automatically recognizes suspicious activities in surveillance videos and thereby helps security personnel take the right action at the right time. In this area, most researchers have focused on people detection and tracking, loitering, etc., whereas detecting violent actions or fights is comparatively less studied. Previous works used local spatiotemporal feature extractors; however, these carry the overhead of complex optical flow estimation. Although the temporal derivative is a fast alternative to optical flow, used alone it gives very low accuracy and scale-dependent results. Hence, we propose a cascaded method of violence detection based on motion boundary SIFT (MoBSIFT) and movement filtering. In this method, surveillance videos are first checked by a movement filtering algorithm based on the temporal derivative, which keeps most nonviolent actions from going through feature extraction. Only the frames that pass the filter go through feature extraction. In addition to the scale-invariant feature transform (SIFT) and histogram of optical flow features, the motion boundary histogram is also extracted, and the three are combined to form the MoBSIFT descriptor. The experimental results show that the proposed MoBSIFT outperforms existing methods in accuracy owing to its high tolerance to camera movements. The use of movement filtering along with MoBSIFT is also shown to reduce time complexity.
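
The movement filtering stage can be pictured with a minimal sketch. It assumes grayscale frames stored as nested lists and takes the mean absolute difference between consecutive frames as a stand-in for the temporal derivative. The function names and the `threshold` value are illustrative assumptions; the abstract does not state the exact filtering criterion used in the paper.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (assumed representation).
Frame = Sequence[Sequence[float]]


def temporal_derivative_energy(prev: Frame, curr: Frame) -> float:
    """Mean absolute intensity change between two equally sized frames."""
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count if count else 0.0


def movement_filter(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Return indices of frames whose change from the previous frame exceeds threshold.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    kept = []
    for i in range(1, len(frames)):
        if temporal_derivative_energy(frames[i - 1], frames[i]) > threshold:
            kept.append(i)
    return kept


# Toy video: the scene changes only between frame 1 and frame 2.
still = [[10, 10], [10, 10]]
moved = [[10, 90], [10, 10]]
video = [still, still, moved, moved]
print(movement_filter(video, threshold=5.0))
```

With the toy video above, only the frame at index 2 differs from its predecessor, so the script prints `[2]`. In a cascade like the one described, only such frames would be passed on to the more expensive descriptor extraction.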

References

  1. de Souza FD, Chavez GC, do Valle EA, de A Araujo A (2010) Violence detection in video using spatio-temporal features. In: 23rd SIBGRAPI conference on graphics, patterns and images, pp 224–230

  2. Deniz O, Serrano I, Bueno G, Kim T-K (2014) Fast violence detection in video. In: VISAPP 2014: proceedings of the 9th international conference on computer vision theory and applications, pp 478–485

  3. Bermejo E, Deniz O, Bueno G, Sukthankar R (2011) Violence detection in video using computer vision techniques. In: Proceedings of the 14th international conference on computer analysis of images and patterns. Springer, pp 332–339

  4. Ke S-R, Thuc H, Lee Y-J et al (2013) A review on video-based human activity recognition. Computers 2:88–131. https://doi.org/10.3390/computers2020088

  5. Chen M, Hauptmann A (2009) MoSIFT: recognizing human actions in surveillance videos. Technical report, Carnegie Mellon University, Pittsburgh, USA

  6. Dalal N, Triggs B, Schmid C (2006) Human detection using oriented histograms of flow and appearance. In: Proceedings of 9th ECCV, pp 428–441

  7. Giannakopoulos T, Kosmopoulos D, Aristidou A, Theodoridis S (2006) Violence content classification using audio features. In: Proceedings of the 4th Hellenic conference on advances in artificial intelligence. Springer, pp 502–507

  8. Gong Y, Wang W, Jiang S, Huang Q, Gao W (2008) Detecting violent scenes in movies by auditory and visual cues. In: Proceedings of the 9th Pacific Rim conference on multimedia. Springer, Berlin, Heidelberg, pp 317–326

  9. Lin J, Wang W (2009) Weakly-supervised violence detection in movies with audio and video based cotraining. In: Proceedings of the 10th Pacific Rim conference on multimedia. Springer, Berlin, Heidelberg, pp 930–935

  10. Nam J, Alghoniemy M, Tewfik AH (1998) Audio-visual content-based violent scene characterization. In: Proceedings 1998 international conference on image processing. ICIP98 (Cat. No. 98CB36269). IEEE Comput. Soc, Chicago, USA, pp 353–357

  11. Cheng W, Chu W, Ling J (2003) Semantic context detection based on hierarchical audio models. In: Proceedings of the ACM SIGMM workshop on multimedia information retrieval, pp 109–115

  12. Giannakopoulos T, Makris A, Kosmopoulos D, Perantonis S, Theodoridis S (2010) Audio-visual fusion for detecting violent scenes in videos. In: Artificial intelligence: theories, models and applications, pp 91–100

  13. Chen L-H, Hsu H-W, Wang L-Y, Su C-W (2011) Violence detection in movies. In: 2011 Eighth international conference computer graphics, imaging and visualization. IEEE Comput. Soc, Washington, DC, USA, pp 119–124

  14. Clarin C, Dionisio J, Echavez M, Naval P (2005) DOVE: Detection of movie violence using motion intensity analysis on skin and blood. Technical report, University of the Philippines

  15. Zajdel W, Krijnders JD, Andringa T, Gavrila DM (2007) CASSANDRA: audio-video sensor fusion for aggression detection. In: 2007 IEEE conference on advanced video and signal based surveillance, pp 200–205

  16. Datta A, Shah M, Da Vitoria Lobo N (2002) Person-on-person violence detection in video data. In: 16th international conference on pattern recognition, pp 433–438

  17. Yun K, Honorio J, Chattopadhyay D, Berg TL, Samaras D (2012) Two-person interaction detection using body-pose features and multiple instance learning. In: IEEE computer society conference on computer vision and pattern recognition workshops, pp 28–35

  18. Gao Z, Nie W, Liu A, Zhang H (2016) Evaluation of local spatial–temporal features for cross-view action recognition. Neurocomputing 173:110–117

  19. Xu L, Gong C, Yang J, Wu Q, Yao L (2014) Violent video detection based on MoSIFT feature and sparse coding. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 3538–3542

  20. Hassner T, Itcher Y, Kliper-Gross O (2012) Violent flows: real-time detection of violent crowd behavior. In: 2012 IEEE computer society conference on computer vision and pattern recognition workshops. IEEE, Providence, USA, pp 1–6

  21. Mousavi H, Mohammadi S, Perina A, Chellali R, Murino V (2015) Analyzing tracklets for the detection of abnormal crowd behavior. In: IEEE winter conference on applications of computer vision, pp 148–15

  22. Colque RVHM, Junior CAC, Schwartz WR (2015) Histograms of optical flow orientation and magnitude to detect anomalous events in videos. In: 28th SIBGRAPI conference on graphics, patterns and images, pp 126–133

  23. Gao Y, Liu H, Sun X, Wang C, Liu Y (2016) Violence detection using oriented violent flows. Image Vis Comput 48–49:37–41

  24. Zhang T, Yang Z, Jia W, Yang B, Yang J, He X (2016) A new method for violence detection in surveillance scenes. Multimed Tools Appl 75:7327–7349

  25. Zhang T, Jia W, He X, Yang J (2017) Discriminative dictionary learning with motion weber local descriptor for violence detection. IEEE Trans Circuits Syst Video Technol 27(3):696–709

  26. Senst T, Eiselein V, Kuhn A, Sikora T (2017) Crowd violence detection using global motion-compensated lagrangian features and scale-sensitive video-level representation. IEEE Trans Inf Forensics Secur 12(12):2945–2956

  27. Mabrouk AB, Zagrouba E (2017) Spatio-temporal feature using optical flow based distribution for violence detection. Pattern Recognit Lett 92:62–67

  28. Gracia IS, Suarez OD, Garcia GB, Kim T-K (2015) Fast fight detection. PLoS ONE 10(4):e0120448. https://doi.org/10.1371/journal

  29. Schuldt C, Laptev I, Caputo B (2004) Recognizing human actions: a local SVM approach. In: 17th international conference on pattern recognition (ICPR’04). IEEE Comput. Soc, Washington, DC, USA, vol 3, pp 32–36

  30. Gorelick L, Blank M, Shechtman E, Irani M, Basri R (2007) Actions as space-time shapes. IEEE Trans Pattern Anal Mach Intell 29(12):2247–2253

  31. Weinland D, Ronfard R, Boyer E (2006) Free viewpoint action recognition using motion history volumes. Comput Vis Image Underst 104:249–257

  32. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91–110. https://doi.org/10.1023/B:VISI.0000029664.99615.94

  33. Paul M, Haque SME, Chakraborty S (2013) Human detection in surveillance videos and its applications: a review. EURASIP J Adv Signal Process 2013:176

  34. Wang H, Klaser A, Schmid C, Liu C-L (2011) Action recognition by dense trajectories. In: 2011 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, Colorado Springs, USA, pp 3169–3176

  35. Liu M, Wang M, Wang J, Li D (2013) Comparison of random forest, support vector machine and back propagation neural network for electronic tongue data classification: application to the recognition of orange beverage and Chinese vinegar. Sens Actuators B 177:970–980

  36. Lorena AC, Jacintho LFO, Siqueira MF, De Giovanni R, Lohmann LG, de Carvalho ACPLF, Yamamoto M (2011) Comparing machine learning classifiers in potential distribution modelling. Expert Syst Appl 38:5268–5275

Acknowledgements

I extend my gratitude to Govt. Model Engineering College for providing full support for this work. I also thank Bermejo et al. [3] for making the Movies and Hockey datasets freely available.

Author information

Corresponding author

Correspondence to I. P. Febin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Febin, I.P., Jayasree, K. & Joy, P.T. Violence detection in videos for an intelligent surveillance system using MoBSIFT and movement filtering algorithm. Pattern Anal Applic 23, 611–623 (2020). https://doi.org/10.1007/s10044-019-00821-3

  • DOI: https://doi.org/10.1007/s10044-019-00821-3
