ABSTRACT
In this paper, we explore a multimodal approach to sensing affective state during exposure to visual narratives. Using four modalities, consisting of visual facial behaviors, thermal imaging, heart rate measurements, and verbal descriptions, we show that changes in human affect can be predicted effectively. Our experiments show that these modalities complement one another, and illustrate the role each of the four plays in detecting human affect.
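As an illustration of how predictions from several modalities can be combined, the sketch below applies a simple weighted late fusion to hypothetical per-modality affect scores. The modality names, scores, and uniform weights are assumptions for the example, not the paper's actual method or data; in practice the weights could be tuned on held-out data.

```python
import numpy as np

# Hypothetical per-modality affect scores (e.g., predicted valence in [-1, 1])
# over three time windows; values are illustrative only.
modality_scores = {
    "facial":  np.array([0.2, 0.5, 0.7]),
    "thermal": np.array([0.1, 0.4, 0.6]),
    "heart":   np.array([0.3, 0.3, 0.8]),
    "verbal":  np.array([0.0, 0.6, 0.5]),
}

# Assumed fusion weights; uniform here for simplicity.
weights = {m: 0.25 for m in modality_scores}

def late_fusion(scores, weights):
    """Weighted average of per-modality predictions (late fusion)."""
    total_w = sum(weights.values())
    return sum(weights[m] * s for m, s in scores.items()) / total_w

fused = late_fusion(modality_scores, weights)
# fused is the combined affect estimate per time window.
```

Late fusion is only one of several plausible strategies; feature-level (early) fusion or model-based fusion would combine the modalities before prediction instead.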
Towards sensing the influence of visual narratives on human affect