Fight Detection in Images Using Postural Analysis

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2021)

Abstract

This paper describes the research process carried out to develop a system for detecting fights in images. The system takes input frames and estimates the probability that each frame contains one or more people fighting. Frames are first processed with OpenPose, a well-known neural architecture that extracts pose information from images containing human postures. The posture data is then processed by heuristics to extract the angles of the arms and legs of each person in the image. These angles are fed to an additional neural network trained to predict the probability that people are involved in a fight. The paper describes the full pipeline, covering the techniques, tools and assessment required to create a camera-based violence detection system.
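The angle-extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the heuristics the authors use, so everything here is an assumption: the keypoint names, the `LIMB_JOINTS` table, and the choice to measure the angle at the elbow and knee are hypothetical, and the input is assumed to be a dictionary of 2D keypoints per person, as a pose estimator such as OpenPose might provide after post-processing.

```python
import math

def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c.

    Returns None when either segment has zero length (degenerate keypoints).
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        return None
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

# Hypothetical joint definitions: (start, vertex, end) keypoint names.
# These names are illustrative and not taken from the paper.
LIMB_JOINTS = {
    "right_elbow": ("right_shoulder", "right_elbow", "right_wrist"),
    "left_elbow": ("left_shoulder", "left_elbow", "left_wrist"),
    "right_knee": ("right_hip", "right_knee", "right_ankle"),
    "left_knee": ("left_hip", "left_knee", "left_ankle"),
}

def limb_angles(keypoints):
    """Map each joint name to its angle, or None if a keypoint is missing."""
    angles = {}
    for name, (a, b, c) in LIMB_JOINTS.items():
        if a in keypoints and b in keypoints and c in keypoints:
            angles[name] = joint_angle(keypoints[a], keypoints[b], keypoints[c])
        else:
            angles[name] = None  # joint not detected in this frame
    return angles

# Usage with made-up coordinates for one person: a right arm bent at a
# right angle; the other joints are absent from the detection.
person = {
    "right_shoulder": (0.0, 1.0),
    "right_elbow": (0.0, 0.0),
    "right_wrist": (1.0, 0.0),
}
print(limb_angles(person))
```

Angles have the practical advantage of not depending on where the person stands in the frame or how large they appear, which is one plausible reason to prefer them over raw coordinates as classifier input. Missing joints are reported as None here so that a downstream model can decide how to handle occlusions.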

Notes

  1. COCO is a widely used dataset for object detection, segmentation and labelling tasks. It currently contains around 330,000 images, including images of tagged people. It is available at: https://cocodataset.org/.

Author information

Corresponding author

Correspondence to José Gaviria de la Puerta.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Landa, E.A., de la Puerta, J.G., Lopez-Gazpio, I., Pastor-López, I., Tellaeche, A., Bringas, P.G. (2021). Fight Detection in Images Using Postural Analysis. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2021. Lecture Notes in Computer Science, vol 12886. Springer, Cham. https://doi.org/10.1007/978-3-030-86271-8_20

  • DOI: https://doi.org/10.1007/978-3-030-86271-8_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86270-1

  • Online ISBN: 978-3-030-86271-8

  • eBook Packages: Computer Science, Computer Science (R0)