Extracting Activities from Multimodal Observation

Brdiczka, Oliver; Maisonnasse, Jérôme; Reignier, Patrick; Crowley, James L.

doi:10.1007/11893004_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4252)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

This paper addresses the extraction of small group configurations and activities in an intelligent meeting environment. The proposed approach takes a continuous stream of observations coming from different sensors in the environment as input. The goal is to separate distinct distributions of these observations corresponding to distinct group configurations and activities. In this paper, we explore an unsupervised method based on the calculation of the Jeffrey divergence between histograms over observations. The obtained distinct distributions of observations can be interpreted as distinct segments of group configuration and activity. To evaluate this approach, we recorded a seminar and a cocktail party meeting. The observations of the seminar were generated by a speech activity detector, while the observations of the cocktail party meeting were generated by both the speech activity detector and a visual tracking system. We measured the correspondence between detected segments and labelled group configurations and activities. The obtained results are promising, in particular as our method is completely unsupervised.
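The segmentation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete observation symbols, two adjacent fixed-size windows sliding over the stream, and one common histogram form of the Jeffrey divergence (each histogram compared against the mean of the two). The names `histogram`, `jeffrey_divergence`, `divergence_curve`, and the `window` parameter are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def histogram(observations, symbols):
    """Normalized histogram of a window of discrete observations."""
    counts = Counter(observations)
    total = len(observations)
    return [counts[s] / total for s in symbols]


def jeffrey_divergence(p, q):
    """Jeffrey divergence between two normalized histograms.

    Assumed form: sum_i p_i*log(p_i/m_i) + q_i*log(q_i/m_i),
    with m_i = (p_i + q_i) / 2. Empty bins contribute zero.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        mi = (pi + qi) / 2.0
        if pi > 0:
            total += pi * math.log(pi / mi)
        if qi > 0:
            total += qi * math.log(qi / mi)
    return total


def divergence_curve(observations, symbols, window):
    """Divergence between the windows just before and just after each t."""
    curve = []
    for t in range(window, len(observations) - window + 1):
        left = histogram(observations[t - window:t], symbols)
        right = histogram(observations[t:t + window], symbols)
        curve.append((t, jeffrey_divergence(left, right)))
    return curve


# Toy stream: the observation distribution changes at index 20.
stream = ["A"] * 20 + ["B"] * 20
curve = divergence_curve(stream, ["A", "B"], window=5)
peak_t, peak_val = max(curve, key=lambda item: item[1])
```

In this toy stream the curve peaks where the two windows straddle the change in distribution, which is the kind of boundary a segmentation step would look for. How peaks are selected and how window sizes are chosen is left open here, since the abstract does not specify it.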

Author information

Authors and Affiliations

INRIA Rhône-Alpes, Montbonnot, France
Oliver Brdiczka, Jérôme Maisonnasse, Patrick Reignier & James L. Crowley

Editor information

Editors and Affiliations

School of Design, Engineering and Computing, Bournemouth University, UK
Bogdan Gabrys
Centre for SMART Systems, School of Environment and Technology, University of Brighton, BN2 4GJ, Brighton, UK
Robert J. Howlett
School of Electrical and Information Engineering, Knowledge Based Intelligent Engineering Systems Centre, University of South Australia, SA, 5095, Mawson Lakes, Australia
Lakhmi C. Jain

About this paper

Cite this paper

Brdiczka, O., Maisonnasse, J., Reignier, P., Crowley, J.L. (2006). Extracting Activities from Multimodal Observation. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2006. Lecture Notes in Computer Science, vol 4252. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893004_21

DOI: https://doi.org/10.1007/11893004_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46537-9
Online ISBN: 978-3-540-46539-3
eBook Packages: Computer Science, Computer Science (R0)
