Weakly-Supervised Violence Detection in Movies with Audio and Video Based Co-training

Lin, Jian; Wang, Weiqiang

doi:10.1007/978-3-642-10467-1_84


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5879)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

In this work, we present a novel method to detect violent shots in movies. The detection process is split into two views: audio and video. In the audio view, a weakly-supervised method is exploited to improve classification performance. In the video view, a classifier is used to detect violent shots. Finally, the auditory and visual classifiers are combined through co-training. Preliminary experimental results on several movies with violent content show the effectiveness of our method.
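
The abstract does not give the details of the co-training procedure, so the following is only a minimal sketch of a generic co-training loop over two views, not the authors' implementation. The nearest-centroid classifier, the names `CentroidClassifier` and `co_train`, the toy feature values, and the parameters `rounds` and `per_round` are all assumptions made for illustration. Each view trains on the currently labelled shots, then labels the unlabelled shot it is most confident about, and that label becomes training data for the other view.

```python
import math


def centroid(rows):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


class CentroidClassifier:
    """Toy per-view classifier (hypothetical stand-in, not the paper's model)."""

    def fit(self, X, y):
        # Assumes both classes (1 = violent, 0 = non-violent) are present in y.
        self.pos = centroid([x for x, label in zip(X, y) if label == 1])
        self.neg = centroid([x for x, label in zip(X, y) if label == 0])
        return self

    def score(self, x):
        """Positive when x is closer to the violent centroid; magnitude is confidence."""
        return math.dist(x, self.neg) - math.dist(x, self.pos)


def co_train(audio, video, seed_labels, rounds=5, per_round=1):
    """Grow a small labelled set by letting each view label shots in turn.

    audio, video: per-shot feature vectors, aligned by shot index.
    seed_labels:  dict mapping shot index -> 0/1 for the hand-labelled shots.
    """
    labels = dict(seed_labels)  # work on a copy so the seed set is untouched
    for _ in range(rounds):
        for view in (audio, video):
            unlabeled = [i for i in range(len(view)) if i not in labels]
            if not unlabeled:
                return labels
            idx = sorted(labels)
            clf = CentroidClassifier().fit(
                [view[i] for i in idx], [labels[i] for i in idx]
            )
            # Label the shots this view is most confident about.
            ranked = sorted(
                unlabeled, key=lambda i: abs(clf.score(view[i])), reverse=True
            )
            for i in ranked[:per_round]:
                labels[i] = 1 if clf.score(view[i]) > 0 else 0
    return labels


# Made-up features: even-indexed shots are "violent" (loud audio, high motion).
audio = [[0.9, 0.8], [0.1, 0.2], [0.8, 0.9], [0.2, 0.1], [0.85, 0.7], [0.15, 0.3]]
video = [[0.9], [0.1], [0.7], [0.2], [0.8], [0.05]]
seed = {0: 1, 1: 0}  # only two shots are labelled by hand

print(co_train(audio, video, seed))
```

On this well-separated toy data, the two hand-labelled shots are enough for the loop to label the remaining four. On real movie data, the confidence threshold and the number of shots added per round would need tuning, since an early wrong label is passed on to the other view.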

Author information

Authors and Affiliations

Graduate University of Chinese Academy of Sciences, Beijing, China
Jian Lin & Weiqiang Wang
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Weiqiang Wang

Editor information

Editors and Affiliations

Naresuan University, 65000, Phitsanulok, Thailand
Paisarn Muneesawang
Microsoft Research Asia, 100109, Beijing, China
Feng Wu
Tokyo Institute of Technology, 226-8503, Yokohama, Japan
Itsuo Kumazawa
Mahanakorn University of Technology, 10530, Bangkok, Thailand
Athikom Roeksabutr
Institute of Information Science, Academia Sinica, Taipei, Taiwan
Mark Liao
Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Xiaoou Tang

About this paper

Cite this paper

Lin, J., Wang, W. (2009). Weakly-Supervised Violence Detection in Movies with Audio and Video Based Co-training. In: Muneesawang, P., Wu, F., Kumazawa, I., Roeksabutr, A., Liao, M., Tang, X. (eds) Advances in Multimedia Information Processing - PCM 2009. PCM 2009. Lecture Notes in Computer Science, vol 5879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10467-1_84

DOI: https://doi.org/10.1007/978-3-642-10467-1_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10466-4
Online ISBN: 978-3-642-10467-1
eBook Packages: Computer Science, Computer Science (R0)
