Analysis of Machine Learning Algorithms for Violence Detection in Audio

Veloso, Bruno; Durães, Dalila; Novais, Paulo

doi:10.1007/978-3-031-18697-4_17

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1678)

Included in the following conference series:

International Conference on Practical Applications of Agents and Multi-Agent Systems

Abstract

Violence has always been part of humanity. It takes different forms, with physical violence being the most recurrent in daily life. This type of violence increasingly affects many people’s lives, so it is essential to try to combat it. In recent years, human action recognition has been studied extensively, but mainly in video, an important area of computer vision. Audio appears as a way to circumvent several of the problems that video faces. Audio sensors can be omnidirectional and require less processing power, and less hardware and software performance, than video. Audio can convey emotions. It is not affected by lighting or temperature problems, nor does the sensor need to be at a favourable angle to capture the intended information. For these reasons, audio is seen as the best way to recognize violence when combined with Machine Learning, Deep Learning, and Transfer Learning techniques. In this paper we test a Convolutional Neural Network (CNN), ResNet50, VGG16 and VGG19 for classifying audio clips. The CNN obtains the best results, with 92.44% accuracy on the test set. ResNet50 was the worst-performing model, with 86.34% accuracy. Both VGG models show good potential but did not outperform the CNN.
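The accuracy figures quoted in the abstract are, by the usual definition, the fraction of test-set clips whose predicted class matches the true class. The following is a minimal sketch of that calculation only, not the authors' evaluation code. The binary labelling (1 = violent, 0 = non-violent), the function name `accuracy`, and the eight example labels are all assumptions made for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of clips whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for eight test clips (assumed: 1 = violent, 0 = non-violent).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 1]

# Six of the eight predictions match, so this prints "accuracy = 75.00%".
print(f"accuracy = {accuracy(y_true, y_pred):.2%}")
```

Accuracy alone does not show whether the errors are missed violent clips or false alarms, so a comparison between models would normally look at per-class errors as well.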

Supported by organization ALGORITMI Centre.

Notes

1. moviepy library: https://github.com/Zulko/moviepy
2. pydub library: https://github.com/jiaaro/pydub
3. librosa library: https://librosa.org/doc/latest/index.html

Acknowledgements

This work is supported by FCT, Fundação para a Ciência e Tecnologia, within the R&D Units Project Scope: UIDB/00319/2020.

Author information

Authors and Affiliations

ALGORITMI Centre, University of Minho, Braga, Portugal
Bruno Veloso, Dalila Durães & Paulo Novais

Corresponding author

Correspondence to Dalila Durães.

Editor information

Editors and Affiliations

University of Salamanca, Salamanca, Spain
Alfonso González-Briones
Institute of Engineering - Polytechnic of Porto, Porto, Portugal
Ana Almeida
Universidad Rey Juan Carlos, Madrid, Spain
Alberto Fernandez
German International University, Cairo, Egypt
Alia El Bolock
University of Minho, Braga, Portugal
Dalila Durães
Universitat Politècnica de València, Valencia, Spain
Jaume Jordán
National Laboratory of Energy and Geology, Amadora, Portugal
Fernando Lopes

About this paper

Cite this paper

Veloso, B., Durães, D., Novais, P. (2022). Analysis of Machine Learning Algorithms for Violence Detection in Audio. In: González-Briones, A., et al. Highlights in Practical Applications of Agents, Multi-Agent Systems, and Complex Systems Simulation. The PAAMS Collection. PAAMS 2022. Communications in Computer and Information Science, vol 1678. Springer, Cham. https://doi.org/10.1007/978-3-031-18697-4_17

DOI: https://doi.org/10.1007/978-3-031-18697-4_17
Published: 13 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18696-7
Online ISBN: 978-3-031-18697-4
eBook Packages: Computer Science; Computer Science (R0)