Abstract
This paper presents our research on detecting violent scenes in movies using learning-based fusion of audio and visual information. We follow a multi-step approach. First, automated audio and visual analysis estimates probabilistic measures for a set of audio-related and visual-related classes. Second, a meta-classification architecture combines the audio and visual information to classify mid-term video segments as “violent” or “non-violent”. The proposed scheme has been evaluated on a real dataset drawn from 10 films.
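The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the first-stage classes, the meta-classifier, or any data, so everything here is an assumption made for illustration: the three audio classes (shots, music, speech), the two visual classes (high motion, person present), the k-nearest-neighbour meta-classifier, and all of the toy probability values.

```python
import math
from collections import Counter

def fuse(audio_probs, visual_probs):
    """Stage 2 input: concatenate the stage-1 audio and visual class probabilities."""
    return list(audio_probs) + list(visual_probs)

def knn_predict(train_vectors, train_labels, query, k=3):
    """Hypothetical meta-classifier: majority vote among the k nearest fused vectors."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(train_vectors[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Made-up stage-1 outputs per mid-term segment (assumed classes, not from the paper):
# audio = [P(shots), P(music), P(speech)], visual = [P(high motion), P(person present)]
segments = [
    ([0.8, 0.1, 0.1], [0.9, 0.7], "violent"),
    ([0.7, 0.2, 0.1], [0.8, 0.9], "violent"),
    ([0.6, 0.1, 0.3], [0.7, 0.6], "violent"),
    ([0.1, 0.7, 0.2], [0.2, 0.3], "non-violent"),
    ([0.0, 0.2, 0.8], [0.1, 0.8], "non-violent"),
    ([0.1, 0.6, 0.3], [0.3, 0.1], "non-violent"),
]
train_vectors = [fuse(audio, visual) for audio, visual, _ in segments]
train_labels = [label for _, _, label in segments]

# Classify two unseen segments from their fused stage-1 probabilities.
loud_fight = fuse([0.75, 0.15, 0.10], [0.85, 0.80])
quiet_talk = fuse([0.05, 0.50, 0.45], [0.20, 0.40])
print(knn_predict(train_vectors, train_labels, loud_fight))
print(knn_predict(train_vectors, train_labels, quiet_talk))
```

Concatenating the stage-1 probabilities before the second classifier is only one possible fusion design; the k-nearest-neighbour rule could be swapped for any classifier trained on the fused vectors.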
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Giannakopoulos, T., Makris, A., Kosmopoulos, D., Perantonis, S., Theodoridis, S. (2010). Audio-Visual Fusion for Detecting Violent Scenes in Videos. In: Konstantopoulos, S., Perantonis, S., Karkaletsis, V., Spyropoulos, C.D., Vouros, G. (eds) Artificial Intelligence: Theories, Models and Applications. SETN 2010. Lecture Notes in Computer Science, vol 6040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12842-4_13
DOI: https://doi.org/10.1007/978-3-642-12842-4_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12841-7
Online ISBN: 978-3-642-12842-4
eBook Packages: Computer Science, Computer Science (R0)