
Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy for Audio-Visual Emotion Recognition

  • Conference paper
Affective Computing and Intelligent Interaction (ACII 2011)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6974)

Abstract

This paper presents an approach to bi-modal emotion recognition based on a semi-coupled hidden Markov model (SC-HMM). A simplified state-based bi-modal alignment strategy is proposed in the SC-HMM to align the states of the audio and visual streams in time. With this strategy, the SC-HMM alleviates the problem of data sparseness and achieves better statistical dependency between the states of the audio and visual HMMs in most real-world scenarios. For performance evaluation, audio-visual signals covering four emotional states (happy, neutral, angry, and sad) were collected. Each of the seven invited subjects was asked to utter 30 types of sentences twice for each emotion, generating emotional speech and facial expressions. Experimental results show that the proposed bi-modal approach outperforms other fusion-based bi-modal emotion recognition methods.
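The abstract does not spell out the SC-HMM itself, so the sketch below does not attempt to reproduce it. As a point of comparison only, it illustrates the kind of decision-level fusion baseline that the paper says it outperforms: one HMM per modality and per emotion, scored independently with the forward algorithm and combined by a weighted sum of log-likelihoods. Everything in it is an assumption made for illustration: the discrete observation symbols, the toy model parameters, the function names, and the `audio_weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) for a discrete-observation HMM.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of emitting symbol o in state s
    All probabilities are assumed to be strictly positive.
    """
    n = len(start)
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + math.log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify_late_fusion(audio_obs, visual_obs, models, audio_weight=0.5):
    """Pick the emotion whose audio and visual HMMs jointly score highest.

    models maps emotion -> (audio_hmm, visual_hmm), where each HMM is a
    (start, trans, emit) tuple. The two streams are scored independently,
    so no state-level alignment between modalities takes place here.
    """
    scores = {}
    for emotion, (audio_hmm, visual_hmm) in models.items():
        log_audio = forward_log_likelihood(audio_obs, *audio_hmm)
        log_visual = forward_log_likelihood(visual_obs, *visual_hmm)
        scores[emotion] = audio_weight * log_audio + (1 - audio_weight) * log_visual
    return max(scores, key=scores.get), scores


# Hypothetical two-state toy models over two observation symbols (0 and 1).
TRANS = [[0.9, 0.1], [0.1, 0.9]]
HAPPY_HMM = ([0.5, 0.5], TRANS, [[0.8, 0.2], [0.7, 0.3]])
SAD_HMM = ([0.5, 0.5], TRANS, [[0.2, 0.8], [0.3, 0.7]])
TOY_MODELS = {"happy": (HAPPY_HMM, HAPPY_HMM), "sad": (SAD_HMM, SAD_HMM)}

label, _ = classify_late_fusion([0, 0, 0], [0, 0, 0], TOY_MODELS)
print(label)
```

Because each stream is scored in isolation, this baseline ignores how audio and visual states co-occur in time, which is the relation the proposed state-based alignment strategy is meant to capture.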




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, JC., Wu, CH., Wei, WL. (2011). Semi-Coupled Hidden Markov Model with State-Based Alignment Strategy for Audio-Visual Emotion Recognition. In: D’Mello, S., Graesser, A., Schuller, B., Martin, JC. (eds) Affective Computing and Intelligent Interaction. ACII 2011. Lecture Notes in Computer Science, vol 6974. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24600-5_22


  • DOI: https://doi.org/10.1007/978-3-642-24600-5_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24599-2

  • Online ISBN: 978-3-642-24600-5

  • eBook Packages: Computer Science, Computer Science (R0)
