Part of the book series: Studies in Computational Intelligence ((SCI,volume 83))

This work provides a detailed overview of related work on the emotion recognition task. Common definitions of emotion are given, and known issues such as cultural dependencies are explained. Labeling issues are illustrated with examples, and comparable recognition experiments and data collections are introduced to give an overview of the state of the art. Possible data acquisition methods are compared: recording acted emotional material, recording induced emotional data in Wizard-of-Oz scenarios, and recording real-life emotions. A complete scenario for an automatic emotion recognizer is also included, comprising a possible way of collecting emotional data, a human perception experiment for benchmarking data quality, the extraction of commonly used features, and recognition experiments using multi-classifier systems and RBF ensembles. Results close to human performance were achieved using RBF ensembles, which are simple to implement and fast to train.



© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Scherer, S., Schwenker, F., Palm, G. (2008). Emotion Recognition from Speech Using Multi-Classifier Systems and RBF-Ensembles. In: Prasad, B., Prasanna, S.R.M. (eds) Speech, Audio, Image and Biomedical Signal Processing using Neural Networks. Studies in Computational Intelligence, vol 83. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75398-8_3


  • DOI: https://doi.org/10.1007/978-3-540-75398-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75397-1

  • Online ISBN: 978-3-540-75398-8

  • eBook Packages: Engineering (R0)
