ABSTRACT
Recent advances in Affective Computing (AC) include research towards the automatic analysis of emotionally enhanced human behavior during multiparty interactions in different contextual settings. This paper examines how context is incorporated into multiparty and multimodal interaction within the AC framework. Reviewing the current state of the art in the field, we address several aspects of context incorporation as research questions: its importance and motivation, appropriate emotional models, resources of multiparty interactions useful for context analysis, context as an additional modality in multimodal AC, and context-aware AC systems. Challenges arising from the incorporation of context are identified and discussed in order to foresee future research directions in the domain. Finally, we propose an architecture for incorporating context into affect-aware systems with multiparty interaction, comprising the detection and extraction of semantic context concepts, the enhancement of emotional models with context information, and the representation of context concepts in appraisal estimation.
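The three stages of the proposed architecture can be sketched as a minimal pipeline. This is an illustrative assumption of how the stages might compose, not the paper's implementation: the lexicon, the valence adjustments, and all function names (`extract_context_concepts`, `enhance_emotion_model`, `appraise`) are hypothetical.

```python
# Hypothetical sketch of the three-stage context-incorporation pipeline:
# (1) extract semantic context concepts, (2) enhance an emotional model
# with context information, (3) use the concepts in appraisal estimation.
# Lexicon entries and adjustment values are illustrative assumptions.

SEMANTIC_LEXICON = {
    "meeting": {"setting": "formal"},
    "game": {"setting": "playful"},
    "debate": {"setting": "adversarial"},
}

def extract_context_concepts(transcript: str) -> dict:
    """Stage 1: detect semantic context concepts in raw interaction data."""
    concepts = {}
    for token in transcript.lower().split():
        if token in SEMANTIC_LEXICON:
            concepts.update(SEMANTIC_LEXICON[token])
    return concepts

def enhance_emotion_model(base_valence: float, concepts: dict) -> float:
    """Stage 2: bias a dimensional emotion estimate with context information."""
    adjustment = {"formal": -0.1, "playful": 0.2, "adversarial": -0.2}
    return base_valence + adjustment.get(concepts.get("setting"), 0.0)

def appraise(valence: float) -> str:
    """Stage 3: map the context-adjusted estimate to an appraisal label."""
    return "positive" if valence > 0 else "negative" if valence < 0 else "neutral"

concepts = extract_context_concepts("three-party quiz game interaction")
print(appraise(enhance_emotion_model(0.1, concepts)))  # → positive
```

In a real system, stage 1 would draw on multimodal cues and commonsense resources rather than a keyword lexicon, and stage 3 would feed the concepts into a componential appraisal model; the sketch only fixes the dataflow between the stages.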
Index Terms: Context in Affective Multiparty and Multimodal Interaction: Why, Which, How and Where?