Abstract
This chapter addresses the problem of multimodal analysis of human face-to-face communication. This is important because, in the near future, smart environments equipped with multiple sensory systems will be able to sense the presence of humans and recognize their behaviours, actions, and emotional states. The main goal of the presented study is to develop models of communicative/interactive events in multimedia (audio and video) that are suitable for analysis and subsequent incorporation within virtual reality environments. Interactive, environmental, and emotional characteristics of the communicators are estimated in order to define the communication event as one entity. This is achieved by bringing together results from the social sciences and multimedia signal processing under one umbrella: communication atmosphere analysis. Experiments based on real-life recordings support the approach.
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Rutkowski, T.M., Mandic, D.P. (2007). Modelling the Communication Atmosphere: A Human Centered Multimedia Approach to Evaluate Communicative Situations. In: Huang, T.S., Nijholt, A., Pantic, M., Pentland, A. (eds) Artificial Intelligence for Human Computing. Lecture Notes in Computer Science, vol 4451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72348-6_8
DOI: https://doi.org/10.1007/978-3-540-72348-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72346-2
Online ISBN: 978-3-540-72348-6
eBook Packages: Computer Science, Computer Science (R0)