Kernel Fusion of Audio and Visual Information for Emotion Recognition

  • Conference paper
Image Analysis and Recognition (ICIAR 2011)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6754)

Abstract

Effective analysis and recognition of human emotional behavior are important for achieving efficient and intelligent human-computer interaction. This paper presents an approach to audiovisual multimodal emotion recognition. The proposed solution integrates audio and visual information by fusing the kernel matrices of the respective channels through algebraic operations, then applies dimensionality reduction to map the original disparate features into a nonlinearly transformed joint subspace. A hidden Markov model characterizes the statistical dependence across successive frames and identifies the inherent temporal structure of the features. We examine the kernel fusion method at both the feature and score levels. The effectiveness of the proposed method is demonstrated through extensive experimentation.
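
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the kernels, the algebraic operations, or the dimensionality reduction technique, so everything specific here is an assumption made for illustration: a Gaussian (RBF) kernel per channel, a weighted sum and an element-wise product as the fusion operations, kernel PCA as the reduction step, and random synthetic features in place of real audio and visual descriptors. The function names are hypothetical and do not come from the paper.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix between all pairs of rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def fuse_kernels(K_audio, K_visual, mode="sum", weight=0.5):
    """Fuse two kernel matrices of equal shape (assumed operations)."""
    if mode == "sum":
        # A convex combination of two PSD matrices is PSD.
        return weight * K_audio + (1.0 - weight) * K_visual
    if mode == "product":
        # The element-wise (Schur) product of two PSD matrices is PSD.
        return K_audio * K_visual
    raise ValueError("mode must be 'sum' or 'product'")


def kernel_pca(K, n_components):
    """Project the training samples onto the leading kernel principal axes."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                           # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[order], 0.0)
    # Training projections: Kc @ (v / sqrt(lam)) = v * sqrt(lam)
    return eigvecs[:, order] * np.sqrt(lam)


# Synthetic stand-ins for per-sample features (not real data).
rng = np.random.default_rng(0)
audio = rng.standard_normal((40, 12))    # 40 samples, 12 audio features
visual = rng.standard_normal((40, 20))   # same 40 samples, 20 visual features

K_a = rbf_kernel(audio, gamma=1.0 / audio.shape[1])
K_v = rbf_kernel(visual, gamma=1.0 / visual.shape[1])

K_sum = fuse_kernels(K_a, K_v, mode="sum")
K_prod = fuse_kernels(K_a, K_v, mode="product")

Z = kernel_pca(K_sum, n_components=3)    # joint-subspace representation
print(Z.shape)
```

Both fusion operations keep the result a valid (symmetric, positive semidefinite) kernel, which is what lets a single kernel-based reduction run on the fused matrix even though the two channels have different feature dimensions. In a frame-level pipeline, the rows of `Z` would be the observations passed to a temporal model such as an HMM.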

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Y., Zhang, R., Guan, L., Venetsanopoulos, A.N. (2011). Kernel Fusion of Audio and Visual Information for Emotion Recognition. In: Kamel, M., Campilho, A. (eds) Image Analysis and Recognition. ICIAR 2011. Lecture Notes in Computer Science, vol 6754. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21596-4_15

  • DOI: https://doi.org/10.1007/978-3-642-21596-4_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21595-7

  • Online ISBN: 978-3-642-21596-4

  • eBook Packages: Computer Science, Computer Science (R0)