Abstract
This paper describes and evaluates a new feature extraction approach for emotion recognition. Our contribution is based on the extraction and characterization of phonemic units, such as vowels and consonants, provided by a pseudo-phonetic speech segmentation phase combined with a vowel detector. The segmentation algorithm is evaluated on both emotional (Berlin) and non-emotional (TIMIT, NTIMIT) databases. For the emotion recognition task, we propose to extract MFCC acoustic features from these pseudo-phonetic segments (vowels, consonants), and we compare this approach with the traditional voiced/unvoiced segmentation. Classification is performed with the well-known k-NN (k-nearest neighbors) classifier on the Berlin corpus.
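The pipeline summarized in the abstract (pseudo-phonetic segmentation, per-segment MFCC extraction, k-NN classification) can be illustrated with a minimal sketch. The snippet below assumes librosa and scikit-learn; the file names, segment boundaries, and emotion labels are hypothetical placeholders, and the paper's own segmentation and vowel-detection algorithms are not reproduced here.

```python
# Minimal sketch: one averaged MFCC vector per pseudo-phonetic segment,
# classified with k-NN. Segment boundaries, files, and labels are placeholders,
# not the authors' actual segmentation output.
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

def segment_mfcc(wav_path, seg_times, n_mfcc=13):
    """Return one MFCC vector (frame average) per (start, end) segment in seconds."""
    y, sr = librosa.load(wav_path, sr=16000)
    feats = []
    for start, end in seg_times:
        seg = y[int(start * sr):int(end * sr)]
        # short analysis window suited to phoneme-length segments
        mfcc = librosa.feature.mfcc(y=seg, sr=sr, n_mfcc=n_mfcc,
                                    n_fft=512, hop_length=160)
        feats.append(mfcc.mean(axis=1))
    return np.vstack(feats)

# Hypothetical training data: one feature vector per segment, one emotion label each.
X_train = segment_mfcc("train_utt.wav", [(0.10, 0.25), (0.25, 0.40)])
y_train = ["anger", "neutral"]

clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X_train, y_train)

X_test = segment_mfcc("test_utt.wav", [(0.05, 0.20)])
print(clf.predict(X_test))
```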
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ringeval, F., Chetouani, M. (2008). Exploiting a Vowel Based Approach for Acted Emotion Recognition. In: Esposito, A., Bourbakis, N.G., Avouris, N., Hatzilygeroudis, I. (eds) Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction. Lecture Notes in Computer Science, vol 5042. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70872-8_19
DOI: https://doi.org/10.1007/978-3-540-70872-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70871-1
Online ISBN: 978-3-540-70872-8