Abstract
Automatic detection of aggressive situations is of high societal and scientific relevance. It has been argued that using data from multimodal sensors, for example video and sound, rather than from a single modality is bound to increase detection accuracy. We approach the problem of multimodal aggression detection from the viewpoint of a human observer and try to reproduce the observer's predictions automatically. Typically, a single ground truth covering all available modalities is used when training recognizers. We explore the benefits of adding an extra level of annotations, namely audio-only and video-only labels. We analyze these annotations and compare them to the multimodal case to gain more insight into how humans reason over multimodal data. We train classifiers and compare the results obtained when using unimodal and multimodal labels as ground truth. For both the audio and the video recognizer, performance increases when the unimodal labels are used.
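To make the experimental setup concrete, the sketch below contrasts the two training regimes described in the abstract: the same audio classifier is trained once with audio-only (unimodal) labels and once with multimodal labels as ground truth, and the cross-validated accuracies are compared. This is a minimal sketch under stated assumptions: the feature arrays and label tracks are hypothetical placeholders, and the SVM is one plausible classifier choice, not necessarily the one used in the paper.

```python
# Minimal sketch of the label-comparison experiment: train the same audio
# classifier twice, once with audio-only (unimodal) labels and once with
# multimodal labels as ground truth, then compare accuracy. Feature
# extraction is out of scope; `audio_features`, `labels_audio_only`, and
# `labels_multimodal` are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_features = 200, 32

# Placeholder data: per-segment acoustic feature vectors and two label
# tracks (0 = neutral, 1 = aggressive) from the two annotation passes.
audio_features = rng.normal(size=(n_samples, n_features))
labels_audio_only = rng.integers(0, 2, size=n_samples)   # audio-only annotation
labels_multimodal = rng.integers(0, 2, size=n_samples)   # multimodal annotation

# An RBF-kernel SVM with feature standardization; any comparable
# classifier could be substituted here.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

for name, labels in [("audio-only labels", labels_audio_only),
                     ("multimodal labels", labels_multimodal)]:
    scores = cross_val_score(clf, audio_features, labels, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```

The same comparison can be run for the video recognizer by swapping in video features and the video-only label track; the paper's finding corresponds to the unimodal-label runs scoring higher than the multimodal-label runs.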
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Lefter, I., Rothkrantz, L.J.M., Burghouts, G., Yang, Z., Wiggers, P. (2011). Addressing Multimodality in Overt Aggression Detection. In: Habernal, I., Matoušek, V. (eds) Text, Speech and Dialogue. TSD 2011. Lecture Notes in Computer Science, vol. 6836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23538-2_4
DOI: https://doi.org/10.1007/978-3-642-23538-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23537-5
Online ISBN: 978-3-642-23538-2