Fight Detection in Images Using Postural Analysis

Landa, Eneko Atxa; de la Puerta, José Gaviria; Lopez-Gazpio, Inigo; Pastor-López, Iker; Tellaeche, Alberto; Bringas, Pablo García

doi:10.1007/978-3-030-86271-8_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12886)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

This paper describes the research process carried out to develop a system for detecting fights in images. The system takes frames as input and estimates the probability that each frame contains one or more people fighting. Frames are first processed with OpenPose, a well-known neural architecture that extracts pose information from images containing human figures. Heuristics then process the pose data to extract the arm and leg angles of each person in the image. These angles are fed to an additional neural network trained to predict the probability that people are involved in a fight. The paper covers the full pipeline of techniques, tools and assessment required to create a camera-based violence detection system.
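The angle-extraction step mentioned in the abstract can be illustrated with a minimal sketch. The heuristics actually used in the paper are not shown in this preview, so everything below is an assumption: the keypoint names, the choice of joint triplets (shoulder-elbow-wrist for arms, hip-knee-ankle for legs), and the helper names `joint_angle` and `limb_angles` are hypothetical. The input is assumed to be a dictionary of 2D keypoints for one person, such as a pose estimator might produce after post-processing.

```python
import math

# Hypothetical joint triplets (end point, joint, end point).
# These names and groupings are illustrative, not taken from the paper.
LIMB_TRIPLETS = {
    "right_arm": ("right_shoulder", "right_elbow", "right_wrist"),
    "left_arm": ("left_shoulder", "left_elbow", "left_wrist"),
    "right_leg": ("right_hip", "right_knee", "right_ankle"),
    "left_leg": ("left_hip", "left_knee", "left_ankle"),
}


def joint_angle(a, joint, b):
    """Return the angle in degrees at `joint` between segments joint->a and joint->b.

    Returns None when either segment has zero length, since the angle is
    undefined in that case.
    """
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (b[0] - joint[0], b[1] - joint[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        return None
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def limb_angles(keypoints):
    """Map each limb name to its joint angle, or None if a keypoint is missing."""
    angles = {}
    for limb, names in LIMB_TRIPLETS.items():
        points = [keypoints.get(name) for name in names]
        if any(p is None for p in points):
            angles[limb] = None  # undetected keypoint: no angle for this limb
        else:
            angles[limb] = joint_angle(*points)
    return angles


# Example with a single, partially detected person (coordinates are made up).
person = {
    "right_shoulder": (0.0, 1.0),
    "right_elbow": (0.0, 0.0),
    "right_wrist": (1.0, 0.0),
}
features = limb_angles(person)
```

In a pipeline of the kind the abstract describes, a vector of such per-person angles would be the input to the classifier; how missing keypoints are encoded for the network is a design choice this sketch leaves open by returning None.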

Notes

1. COCO is a widely used dataset for object detection, segmentation and labelling tasks. It currently contains around 330,000 images, including images of tagged people. It is available at https://cocodataset.org/.

Author information

Authors and Affiliations

University of Deusto, Avenida de las Universidades 24, 48007, Bilbao, Spain
Eneko Atxa Landa, José Gaviria de la Puerta, Inigo Lopez-Gazpio, Iker Pastor-López, Alberto Tellaeche & Pablo García Bringas

Corresponding author

Correspondence to José Gaviria de la Puerta.

Editor information

Editors and Affiliations

University of Deusto, Bilbao, Spain
Hugo Sanjurjo González
University of Deusto, Bilbao, Spain
Iker Pastor López
University of Deusto, Bilbao, Spain
Pablo García Bringas
University of A Coruña, A Coruña, Spain
Héctor Quintián
University of Salamanca, Salamanca, Spain
Emilio Corchado

Cite this paper

Landa, E.A., de la Puerta, J.G., Lopez-Gazpio, I., Pastor-López, I., Tellaeche, A., Bringas, P.G. (2021). Fight Detection in Images Using Postural Analysis. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2021. Lecture Notes in Computer Science, vol 12886. Springer, Cham. https://doi.org/10.1007/978-3-030-86271-8_20

DOI: https://doi.org/10.1007/978-3-030-86271-8_20
Published: 15 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86270-1
Online ISBN: 978-3-030-86271-8
eBook Packages: Computer Science, Computer Science (R0)
