Chapter

Real-time sensing of affect and social signals in a multimodal framework: a practical approach

Published: 01 October 2018

References

  1. J. Allwood, L. Cerrato, K. Jokinen, C. Navarretta, and P. Paggio. 2007. The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena. Language Resources and Evaluation, 41(3-4): 273--287.
  2. O. Aran and D. Gatica-Perez. 2010. Fusing audio-visual nonverbal cues to detect dominant people in conversations. In International Conference on Pattern Recognition (ICPR).
  3. T. Baltrusaitis, C. Ahuja, and L.-P. Morency. 2018. Multimodal machine learning: Challenges and applications. In S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Krueger, eds., The Handbook of Multimodal-Multisensor Interfaces, Volume 2: Signal Processing, Architectures, and Detection of Emotion and Cognition, Chapter 1. Morgan & Claypool Publishers, San Rafael, CA.
  4. A. Batliner, K. Fischer, R. Huber, J. Spilker, and E. Nöth. 2000. Desperately seeking emotions: Actors, wizards, and human beings. In ISCA Workshop on Speech and Emotion: A Conceptual Framework for Research.
  5. A. Battocchi, F. Pianesi, and D. Goren-Bar. 2005. DaFEx: Database of facial expressions. In M. T. Maybury, O. Stock, and W. Wahlster, eds., INTETAIN, vol. 3814 of Lecture Notes in Computer Science, pp. 303--306. Springer, Berlin Heidelberg.
  6. P. Blache, R. Bertrand, and G. Ferré. 2009. Creating and exploiting multimodal annotated corpora: The ToMA project. In M. Kipp, J.-C. Martin, P. Paggio, and D. Heylen, eds., Multimodal Corpora, vol. 5509 of Lecture Notes in Computer Science, pp. 38--53. Springer, Berlin Heidelberg.
  7. F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, and B. Weiss. 2005. A database of German emotional speech. In Conference of the International Speech Communication Association (INTERSPEECH), pp. 1517--1520.
  8. C. Busso, M. Bulut, C.-C. Lee, A. Kazemzadeh, E. Mower, S. Kim, J. N. Chang, S. Lee, and S. Narayanan. 2008. IEMOCAP: interactive emotional dyadic motion capture database. Language Resources and Evaluation, 42(4): 335--359.
  9. J. Carletta, S. Isard, G. Doherty-Sneddon, A. Isard, J. C. Kowtko, and A. H. Anderson. 1997. The reliability of a dialogue structure coding scheme. Computational Linguistics, 23(1): 13--31.
  10. J. Carletta, S. Ashby, S. Bourban, M. Flynn, M. Guillemot, T. Hain, J. Kadlec, V. Karaiskos, W. Kraaij, M. Kronenthal, G. Lathoud, M. Lincoln, A. Lisowska, I. McCowan, W. Post, D. Reidsma, and P. Wellner. 2006. The AMI meeting corpus: A pre-announcement. In Machine Learning for Multimodal Interaction (MLMI), pp. 28--39. Springer, Berlin Heidelberg.
  11. F. Cavicchio and M. Poesio. 2009. Multimodal corpora annotation: Validation methods to assess coding scheme reliability. In M. Kipp, J.-C. Martin, P. Paggio, and D. Heylen, eds., Multimodal Corpora, vol. 5509 of Lecture Notes in Computer Science, pp. 109--121. Springer, Berlin Heidelberg.
  12. L. Chen and T. Huang. 2000. Emotional expressions in audiovisual human computer interaction. In International Conference on Multimedia and Expo (ICME), vol. 1, pp. 423--426.
  13. L. Chen, T. Huang, T. Miyasato, and R. Nakatsu. 1998. Multimodal human emotion/expression recognition. In International Conference on Automatic Face and Gesture Recognition (FGR), pp. 366--371.
  14. C. Chiarcos, S. Dipper, M. Götze, U. Leser, A. Lüdeling, J. Ritz, and M. Stede. 2008. A flexible framework for integrating annotations from different tools and tag sets. Traitement Automatique des Langues, 49(2): 217--246.
  15. M. G. Core and J. F. Allen. 1997. Coding dialogs with the DAMSL annotation scheme. In Working Notes of the AAAI Fall Symposium on Communicative Action in Humans and Machines, pp. 28--35. Cambridge, MA.
  16. R. Cowie and R. R. Cornelius. 2003. Describing the emotional states that are expressed in speech. Speech Communication, 40(1-2): 5--32.
  17. R. Cowie, E. Douglas-Cowie, and C. Cox. 2005. Beyond emotion archetypes: Databases for emotion modelling using neural networks. Neural Networks, 18(4): 371--388.
  18. E. Cox. 1993. Adaptive fuzzy systems. Spectrum, 30(2): 27--31.
  19. L. De Silva and P. C. Ng. 2000. Bimodal emotion recognition. In International Conference on Automatic Face and Gesture Recognition (FGR), pp. 332--335.
  20. S. D'Mello and J. Kory. 2012. Consistent but modest: A meta-analysis on unimodal and multimodal affect detection accuracies from 30 studies. In International Conference on Multimodal Interaction (ICMI '12), pp. 31--38. ACM, New York.
  21. S. D'Mello, N. Bosch, and H. Chen. 2018. Multimodal-multisensor affect detection. In S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Krueger, eds., The Handbook of Multimodal-Multisensor Interfaces, Volume 2: Signal Processing, Architectures, and Detection of Emotion and Cognition, Chapter 6. Morgan & Claypool Publishers, San Rafael, CA.
  22. E. Douglas-Cowie, R. Cowie, and M. Schröder. 2000. A new emotion database: Considerations, sources and scope. In ISCA Workshop on Speech and Emotion: A Conceptual Framework for Research, pp. 39--44. Textflow, Belfast.
  23. E. Douglas-Cowie, N. Campbell, R. Cowie, and P. Roach. 2003. Emotional speech: Towards a new generation of databases. Speech Communication, 40(1-2): 33--60.
  24. E. Douglas-Cowie, L. Devillers, J.-C. Martin, R. Cowie, S. Savvidou, S. Abrilian, and C. Cox. 2005. Multimodal databases of everyday emotion: facing up to complexity. In Conference of the International Speech Communication Association (INTERSPEECH), pp. 813--816. ISCA.
  25. E. Douglas-Cowie, R. Cowie, I. Sneddon, C. Cox, O. Lowry, M. McRorie, J.-C. Martin, L. Devillers, S. Abrilian, A. Batliner, N. Amir, and K. Karpouzis. 2007. The HUMAINE database: Addressing the collection and annotation of naturalistic and induced emotional data. In A. Paiva, R. Prada, and R. W. Picard, eds., International Conference on Affective Computing and Intelligent Interaction (ACII), vol. 4738 of Lecture Notes in Computer Science, pp. 488--500. Springer.
  26. E. Douglas-Cowie, R. Cowie, C. Cox, N. Amir, and D. Heylen. 2008. The sensitive artificial listener: an induction technique for generating emotionally coloured conversation. In L. Devillers, J.-C. Martin, R. Cowie, E. Douglas-Cowie, and A. Batliner, eds., LREC Workshop on Corpora for Research on Emotion and Affect, pp. 1--4. ELRA, Paris, France.
  27. A. Čereković. 2014. An insight into multimodal databases for social signal processing: acquisition, efforts, and directions. Artificial Intelligence Review, 42(4): 663--692.
  28. P. Ekman and W. Friesen. 1978. Facial Action Coding System: A Technique for the Measurement of Facial Movement. Consulting Psychologists Press, Palo Alto, CA.
  29. I. S. Engberg, A. V. Hansen, O. Andersen, and P. Dalsgaard. 1997. Design, recording and verification of a Danish emotional speech database. In G. Kokkinakis, N. Fakotakis, and E. Dermatas, eds., European Conference on Speech Communication and Technology (EUROSPEECH). ISCA.
  30. F. Eyben, M. Wöllmer, M. F. Valstar, H. Gunes, B. Schuller, and M. Pantic. 2011. String-based audiovisual fusion of behavioural events for the assessment of dimensional affect. In International Conference on Automatic Face and Gesture Recognition (FGR), pp. 322--329. IEEE Computer Society.
  31. M. Grimm, K. Kroschel, and S. Narayanan. 2008. The Vera am Mittag German audio-visual emotional speech database. In International Conference on Multimedia and Expo (ICME), pp. 865--868.
  32. A. Hanjalic. 2006. Extracting moods from pictures and sounds: towards truly personalized TV. IEEE Signal Processing Magazine, 23(2): 90--100.
  33. J. Henrich, S. J. Heine, and A. Norenzayan. 2010. The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3): 61--83.
  34. J. Hofmann, F. Stoffel, A. Weber, and T. Platt. 2012. The 16 enjoyable emotions induction task (16-EEIT). Unpublished.
  35. H. Hung and G. Chittaranjan. 2010. The IDIAP Wolf corpus: exploring group behaviour in a competitive role-playing game. In A. D. Bimbo, S.-F. Chang, and A. W. M. Smeulders, eds., International Conference on Multimedia (MM), pp. 879--882. ACM.
  36. D. B. Jayagopi, H. Hung, C. Yeo, and D. Gatica-Perez. 2009. Modeling dominance in group conversations using nonverbal activity cues. Audio, Speech and Language Processing, 17(3): 501--513.
  37. T. Johnstone. 1996. Emotional speech elicited using computer games. In International Conference on Spoken Language Processing (ICSLP). ISCA.
  38. R. E. Kalman. 1960. A new approach to linear filtering and prediction problems. Basic Engineering, 82(Series D): 35--45.
  39. I. Kanluan, M. Grimm, and K. Kroschel. 2008. Audio-visual emotion recognition using an emotion space concept. In European Signal Processing Conference (EUSIPCO).
  40. J. F. Kelley. 1984. An iterative design methodology for user-friendly natural language office information applications. Information Systems, 2(1): 26--41.
  41. P. M. Kenealy. 1986. The Velten mood induction procedure: A methodological review. Motivation and Emotion, 10(4): 315--335.
  42. G. Keren, A. E.-D. Mousa, O. Pietquin, S. Zafeiriou, and B. Schuller. 2018. Deep learning for multisensorial and multimodal interaction. In S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Krueger, eds., The Handbook of Multimodal-Multisensor Interfaces, Volume 2: Signal Processing, Architectures, and Detection of Emotion and Cognition, Chapter 4. Morgan & Claypool Publishers, San Rafael, CA.
  43. M. Kipp. 2013. ANVIL: The video annotation research tool. In J. Durand, U. Gut, and G. Kristofferson, eds., Handbook of Corpus Phonology. University Press, Oxford, UK.
  44. A. Kleinsmith and N. Bianchi-Berthouze. 2007. Recognizing affective dimensions from body posture. In A. Paiva, R. Prada, and R. Picard, eds., Affective Computing and Intelligent Interaction, vol. 4738 of Lecture Notes in Computer Science, pp. 48--58. Springer, Berlin Heidelberg.
  45. P. O. Kristensson and L. C. Denby. 2011. Continuous recognition and visualization of pen strokes and touch-screen gestures. In Eurographics Symposium on Sketch-Based Interfaces and Modeling (SBIM), pp. 95--102. ACM, New York.
  46. P. J. Lang, M. M. Bradley, and B. N. Cuthbert. 2008. International affective picture system (IAPS): Affective ratings of pictures and instruction manual. Technical Report A-8, The Center for Research in Psychophysiology, University of Florida, Gainesville, FL.
  47. J. Lichtenauer, J. Shen, M. F. Valstar, and M. Pantic. 2011. Cost-effective solution to synchronised audio-visual data capture using multiple sensors. Image and Vision Computing, 29: 666--680.
  48. F. Lingenfelser, J. Wagner, and E. André. 2011. A systematic discussion of fusion techniques for multi-modal affect recognition tasks. In International Conference on Multimodal Interfaces (ICMI), pp. 19--26. ACM, New York.
  49. F. Lingenfelser, J. Wagner, E. André, G. McKeown, and W. Curran. 2014. An event driven fusion approach for enjoyment recognition in real-time. In International Conference on Multimedia (MM), pp. 377--386. ACM, New York.
  50. G. McKeown, M. Valstar, R. Cowie, and M. Pantic. 2010. The SEMAINE corpus of emotionally coloured character interactions. In International Conference on Multimedia and Expo (ICME), pp. 1079--1084.
  51. G. McKeown, W. Curran, J. Wagner, F. Lingenfelser, and E. André. 2015. The Belfast storytelling database---a spontaneous social interaction database with laughter focused annotation. In International Conference on Affective Computing and Intelligent Interaction and Workshops (ACII). Xi'an, China.
  52. D. S. Messinger, T. D. Cassel, S. I. Acosta, Z. Ambadar, and J. F. Cohn. 2008. Infant smiling dynamics and perceived positive emotion. Nonverbal Behavior, 32(3): 133--155.
  53. D. S. Messinger, M. H. Mahoor, S.-M. Chow, and J. F. Cohn. 2009. Automated measurement of facial expression in infant-mother interaction: A pilot study. Infancy, 14(3): 285--305.
  54. A. Metallinou, A. Katsamanis, and S. S. Narayanan. 2013. Tracking continuous emotional trends of participants during affective dyadic interactions using body language and speech information. Image and Vision Computing, 31(2): 137--152.
  55. M. A. Nicolaou, H. Gunes, and M. Pantic. 2010. Audio-visual classification and fusion of spontaneous affective data in likelihood space. In International Conference on Pattern Recognition (ICPR), pp. 3695--3699. IEEE.
  56. M. A. Nicolaou, H. Gunes, and M. Pantic. 2011. Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space. Affective Computing, 2(2): 92--105.
  57. R. Niewiadomski and C. Pelachaud. 2012. Towards multimodal expression of laughter. In Y. Nakano, M. Neff, A. Paiva, and M. A. Walker, eds., International Conference on Intelligent Virtual Agents (IVA), vol. 7502 of Lecture Notes in Computer Science, pp. 231--244. Springer.
  58. Y. Panagakis, O. Rudovic, and M. Pantic. 2018. Learning for multi-modal and context-sensitive interfaces. In S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Krueger, eds., The Handbook of Multimodal-Multisensor Interfaces, Volume 2: Signal Processing, Architectures, and Detection of Emotion and Cognition, Chapter 3. Morgan & Claypool Publishers, San Rafael, CA.
  59. M. Pantic, A. Pentland, A. Nijholt, and T. S. Huang. 2007. Human computing and machine understanding of human behavior: A survey. In Artificial Intelligence for Human Computing, ICMI 2006 and IJCAI 2007 International Workshops, Banff, Canada, November 3, 2006, and Hyderabad, India, January 6, 2007, Revised Selected and Invited Papers, pp. 47--71.
  60. A. Pentland. 2007. Social signal processing. IEEE Signal Processing Magazine, 24(4): 108--111.
  61. S. Petridis and M. Pantic. 2008. Audiovisual discrimination between laughter and speech. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5117--5120.
  62. W. Ruch and P. Ekman. 2001. The expressive pattern of laughter. In A. W. Kaszniak, ed., Emotion, Qualia, and Consciousness, pp. 426--443. World Scientific.
  63. J. Sasiadek and P. Hartana. 2000. Sensor data fusion using Kalman filter. In International Conference on Information Fusion (FUSION), vol. 2, pp. WED5/19--WED5/25.
  64. K. R. Scherer and M. R. Zentner. 2001. Emotional effects of music: Production rules. In P. N. Juslin and J. A. Sloboda, eds., Music and Emotion: Theory and Research, pp. 361--392. University Press, Oxford.
  65. F. Schiel, S. Steininger, and U. Türk. 2002. The SmartKom multimodal corpus at BAS. In International Conference on Language Resources and Evaluation (LREC). European Language Resources Association.
  66. T. Schmidt. 2004. Transcribing and annotating spoken language with EXMARaLDA. In International Conference on Language Resources and Evaluation (LREC) Workshop on XML based Richly Annotated Corpora, pp. 879--896. ELRA, Paris.
  67. M. Schröder, R. Cowie, E. Douglas-Cowie, S. Savvidou, E. McMahon, and M. Sawey. 2000. FEELTRACE: An instrument for recording perceived emotion in real time. In ISCA Workshop on Speech and Emotion: A Conceptual Framework for Research, pp. 19--24. Textflow, Belfast.
  68. M. Schröder, P. Baggia, F. Burkhardt, C. Pelachaud, C. Peter, and E. Zovato. 2011. EmotionML---an upcoming standard for representing emotions and related states. In S. K. D'Mello, A. C. Graesser, B. Schuller, and J.-C. Martin, eds., International Conference on Affective Computing and Intelligent Interaction (ACII), vol. 6974 of Lecture Notes in Computer Science, pp. 316--325. Springer.
  69. B. Schuller. 2018. Multimodal user state and trait recognition: An overview. In S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Krueger, eds., The Handbook of Multimodal-Multisensor Interfaces, Volume 2: Signal Processing, Architectures, and Detection of Emotion and Cognition, Chapter 5. Morgan & Claypool Publishers, San Rafael, CA.
  70. B. Schuller, A. Batliner, S. Steidl, and D. Seppi. 2011. Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge. Speech Communication, 53(9-10): 1062--1087.
  71. H. Sloetjes, A. Russel, and A. Klassmann. 2007. ELAN: a free and open-source multimedia annotation tool. In Conference of the International Speech Communication Association (INTERSPEECH), pp. 4015--4016. ISCA.
  72. M. Song, J. Bu, C. Chen, and N. Li. 2004. Audio-visual based emotion recognition---a new approach. In Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1020--1025.
  73. X. Sun, J. Lichtenauer, M. F. Valstar, A. Nijholt, and M. Pantic. 2011. A multimodal database for mimicry analysis. In International Conference on Affective Computing and Intelligent Interaction (ACII). Memphis, TN.
  74. A. Vinciarelli and A. Esposito. 2018. Multimodal analysis of social signals. In S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Krueger, eds., The Handbook of Multimodal-Multisensor Interfaces, Volume 2: Signal Processing, Architectures, and Detection of Emotion and Cognition, Chapter 7. Morgan & Claypool Publishers, San Rafael, CA.
  75. A. Vinciarelli, M. Pantic, H. Bourlard, and A. Pentland. 2008a. Social signals, their function, and automatic analysis: A survey. In International Conference on Multimodal Interfaces (ICMI), pp. 61--68. ACM, New York.
  76. A. Vinciarelli, M. Pantic, H. Bourlard, and A. Pentland. 2008b. Social signal processing: State of the art and future perspectives of an emerging domain. In International Conference on Multimedia (MM), pp. 1061--1070. Vancouver, Canada.
  77. A. Vinciarelli, A. Dielmann, S. Favre, and H. Salamin. 2009a. Canal9: A database of political debates for analysis of social interactions. In International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 1--4.
  78. A. Vinciarelli, M. Pantic, and H. Bourlard. 2009b. Social signal processing: Survey of an emerging domain. Image Vision Computing, 27(12): 1743--1759.
  79. A. Vinciarelli, M. Pantic, D. Heylen, C. Pelachaud, I. Poggi, F. D'Errico, and M. Schröder. 2012. Bridging the gap between social animal and unsocial machine: A survey of social signal processing. Affective Computing, 3(1): 69--87.
  80. J. Wagner, F. Lingenfelser, E. André, J. Kim, and T. Vogt. 2011. Exploring fusion methods for multimodal emotion recognition with missing data. Affective Computing, 2(4): 206--218.
  81. J. Wagner, F. Lingenfelser, T. Baur, I. Damian, F. Kistler, and E. André. 2013. The social signal interpretation (SSI) framework: multimodal signal processing and recognition in real-time. In International Conference on Multimedia (MM), pp. 831--834. ACM, New York.
  82. T. Wilson. 2008. Annotating subjective content in meetings. In International Conference on Language Resources and Evaluation (LREC). European Language Resources Association.
  83. P. Wittenburg, H. Brugman, A. Russel, A. Klassmann, and H. Sloetjes. 2006. ELAN: a professional framework for multimodality research. In International Conference on Language Resources and Evaluation (LREC).
  84. M. Wöllmer, F. Eyben, S. Reiter, B. Schuller, C. Cox, E. Douglas-Cowie, and R. Cowie. 2008. Abandoning emotion classes---towards continuous emotion recognition with modelling of long-range dependencies. In Conference of the International Speech Communication Association (INTERSPEECH), pp. 597--600. ISCA.
  85. M. Wöllmer, M. Al-Hames, F. Eyben, B. Schuller, and G. Rigoll. 2009. A multidimensional dynamic time warping algorithm for efficient multimodal fusion of asynchronous data streams. Neurocomputing, 73(1-3): 366--380.
  86. M. Wöllmer, B. Schuller, F. Eyben, and G. Rigoll. 2010. Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening. Selected Topics Signal Processing, 4(5): 867--881.
  87. D. Wu, T. D. Parsons, E. Mower, and S. Narayanan. 2010. Speech emotion estimation in 3D space. In International Conference on Multimedia and Expo (ICME). Singapore.
  88. Z. Zeng, J. Tu, B. M. Pianfetti, and T. S. Huang. 2008. Audio-visual affective expression recognition through multistream fused HMM. Multimedia, 10(4): 570--577.
  89. Z. Zeng, M. Pantic, G. Roisman, and T. Huang. 2009. A survey of affect recognition methods: Audio, visual, and spontaneous expressions. Pattern Analysis and Machine Intelligence, 31(1): 39--58.


    • Published in

ACM Books
      The Handbook of Multimodal-Multisensor Interfaces: Signal Processing, Architectures, and Detection of Emotion and Cognition - Volume 2
      October 2018
      2034 pages
ISBN: 9781970001716
DOI: 10.1145/3107990

      Publisher

      Association for Computing Machinery and Morgan & Claypool



