
Breaking down violence: A deep-learning strategy to model and classify violence in videos

Published: 27 August 2018

Abstract

Detecting violence in videos through automatic means is important for law enforcement and for the analysis of surveillance footage aimed at maintaining public safety. It can also be a valuable tool for protecting children from inappropriate content and for helping parents make better-informed decisions about what their kids watch. However, this is a challenging problem, since the very definition of violence is broad and highly subjective. Hence, detecting such nuances in videos without human supervision is not only a technical but also a conceptual problem. With this in mind, we explore how to better describe the idea of violence to a convolutional neural network by breaking it into more objective and concrete parts. Our method first uses independent networks to learn features for more specific concepts related to violence, such as fights, explosions, and blood. We then use these features to classify each concept and later fuse the results in a meta-classification that describes violence. We also explore how to represent time-based events in still images used as network inputs, since many violent acts are defined in terms of movement. We show that using more specific concepts is an intuitive and effective solution, and that the concepts are complementary, combining to form a more robust definition of violence. Compared to other methods for violence detection, our approach achieves better classification quality while using only automatically learned features.
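The two-stage pipeline described in the abstract — independent per-concept classifiers whose scores are fused by a meta-classifier — can be sketched roughly as follows. This is an illustrative sketch, not the paper's implementation: the concept list, feature dimensions, synthetic data, and the use of logistic regression (standing in for the authors' CNN features and downstream classifiers) are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative concept list; the paper mentions fights, explosions, blood, etc.
CONCEPTS = ["fight", "explosion", "blood"]

rng = np.random.default_rng(0)

# Stand-in for per-concept CNN features (one network per concept):
# 100 clips, each described by a 64-dim feature vector per concept.
features = {c: rng.normal(size=(100, 64)) for c in CONCEPTS}
labels = rng.integers(0, 2, size=100)  # 1 = violent clip (synthetic)

# Stage 1: one binary classifier per concept produces a concept score.
concept_scores = []
for c in CONCEPTS:
    clf = LogisticRegression(max_iter=1000).fit(features[c], labels)
    concept_scores.append(clf.predict_proba(features[c])[:, 1])

# Stage 2: fuse the per-concept scores with a meta-classifier
# that outputs the final "violence" decision.
meta_input = np.stack(concept_scores, axis=1)  # shape (100, 3)
meta = LogisticRegression().fit(meta_input, labels)
violence_prob = meta.predict_proba(meta_input)[:, 1]
```

The point of the fusion stage is that each concept network only has to learn one concrete, objective notion (a fight, an explosion), while the meta-classifier learns how those concepts combine into the subjective notion of violence.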



Published In

ARES '18: Proceedings of the 13th International Conference on Availability, Reliability and Security
August 2018
603 pages
ISBN:9781450364485
DOI:10.1145/3230833

In-Cooperation

  • Universität Hamburg

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Deep-learning
  2. Semantic Concept Detection
  3. Violence Classification

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2018

Acceptance Rates

ARES '18 paper acceptance rate: 128 of 260 submissions (49%)
Overall acceptance rate: 228 of 451 submissions (51%)

Cited By

  • (2024) IIVRS: an Intelligent Image and Video Rating System to Provide Scenario-Based Content for Different Users. Interacting with Computers 36(6), 406--415. DOI: 10.1093/iwc/iwae034. Online publication date: 21-Jul-2024.
  • (2024) Revisiting vision-based violence detection in videos: A critical analysis. Neurocomputing 597, 128113. DOI: 10.1016/j.neucom.2024.128113. Online publication date: Sep-2024.
  • (2023) Hybrid CNN-LSTM Model for Automated Violence Detection and Classification in Surveillance Systems. 2023 12th International Conference on System Modeling & Advancement in Research Trends (SMART), 169--175. DOI: 10.1109/SMART59791.2023.10428538. Online publication date: 22-Dec-2023.
  • (2023) Using two-stream EfficientNet-BiLSTM network for multiclass classification of disturbing YouTube videos. Multimedia Tools and Applications 83(12), 36519--36546. DOI: 10.1007/s11042-023-15774-3. Online publication date: 17-May-2023.
  • (2022) Deep Learning for Activity Recognition Using Audio and Video. Electronics 11(5), 782. DOI: 10.3390/electronics11050782. Online publication date: 3-Mar-2022.
  • (2022) Human Activity Classification Using the 3DCNN Architecture. Applied Sciences 12(2), 931. DOI: 10.3390/app12020931. Online publication date: 17-Jan-2022.
  • (2021) What should we pay attention to when classifying violent videos? The 16th International Conference on Availability, Reliability and Security, 1--10. DOI: 10.1145/3465481.3470059. Online publication date: 17-Aug-2021.
  • (2021) Detecting Violent Arm Movements Using CNN-LSTM. 2021 5th International Conference on Electrical Information and Communication Technology (EICT), 1--6. DOI: 10.1109/EICT54103.2021.9733510. Online publication date: 17-Dec-2021.
  • (2021) Suspicious Activity Recognition Using Proposed Deep L4-Branched-Actionnet With Entropy Coded Ant Colony System Optimization. IEEE Access 9, 89181--89197. DOI: 10.1109/ACCESS.2021.3091081. Online publication date: 2021.
  • (2020) Violent Behavioral Activity Classification Using Artificial Neural Network. 2020 New Trends in Signal Processing (NTSP), 1--5. DOI: 10.1109/NTSP49686.2020.9229532. Online publication date: 14-Oct-2020.