Abstract
In this paper we introduce features extracted from Fujisaki’s parameterization of the pitch contour for the task of emotion recognition from speech. To evaluate the proposed features we trained a decision tree inducer as well as an instance-based learning algorithm. The datasets used to train the classification models were extracted from two emotional speech databases. Fujisaki’s parameters benefited all prediction models, raising total accuracy by 9.52% on average.
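As a pointer for readers without access to the full text, Fujisaki’s model (Fujisaki and Hirose, 1984) superposes phrase and accent components on a logarithmic baseline; the command magnitudes and timings below are the kind of parameters extracted as features:

\ln F_0(t) = \ln F_b + \sum_{i=1}^{I} A_{p\,i}\, G_p(t - T_{0i}) + \sum_{j=1}^{J} A_{a\,j}\, \bigl[ G_a(t - T_{1j}) - G_a(t - T_{2j}) \bigr]

G_p(t) = \begin{cases} \alpha^2 t\, e^{-\alpha t}, & t \ge 0 \\ 0, & t < 0 \end{cases} \qquad G_a(t) = \begin{cases} \min\bigl[\, 1 - (1 + \beta t)\, e^{-\beta t},\ \gamma \,\bigr], & t \ge 0 \\ 0, & t < 0 \end{cases}

Here F_b is the speaker’s base frequency; A_{p\,i} and T_{0i} are the magnitude and onset time of the i-th phrase command; A_{a\,j}, T_{1j}, T_{2j} are the amplitude, onset, and offset of the j-th accent command; \alpha and \beta are the time constants of the phrase and accent control mechanisms, and \gamma (commonly set to 0.9) is the ceiling of the accent component.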
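The classifiers named in the references (Quinlan’s C4.5 decision tree and the instance-based learner of Aha et al., evaluated with cross-validation after Stone) suggest a Weka-style setup. Below is a minimal sketch of an analogous evaluation in Python/scikit-learn, not the authors’ actual pipeline: load_fujisaki_features() is a hypothetical helper returning per-utterance Fujisaki parameters X and emotion labels y, and the 10-fold split is an assumption.

from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical loader (not from the paper): each row of X holds the
# Fujisaki parameters of one utterance (base frequency, phrase/accent
# command magnitudes and timings); y holds the emotion labels.
X, y = load_fujisaki_features()

models = {
    "decision tree (C4.5-like)": DecisionTreeClassifier(criterion="entropy"),
    "instance-based (k-NN)": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    # 10-fold cross-validation; the fold count is an assumption.
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.3f}")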
References
Murray, I.R., Arnott, J.L.: Towards a simulation of emotion in synthetic speech: a review of the literature on human vocal emotion. Journal of the Acoustical Society of America 93(2), 1097–1108 (1993)
Cowie, R., Douglas-Cowie, E.: Automatic statistical analysis of the signal and prosodic signs of emotion in speech. In: Proc. of ICSLP, Philadelphia, pp. 1989–1992 (1998)
Gobl, C., Chasaide, A.N.: Testing Affective Correlates of Voice Quality through Analysis and Resynthesis. In: Proc. of ISCA Workshop on Emotion and Speech (2000)
Kienast, M., Sendlmeier, W.: Acoustical Analysis of Spectral and Temporal Changes in Emotional Speech. In: ISCA Workshop on Emotion and Speech (2000)
Fujisaki, H., Hirose, K.: Analysis of voice fundamental frequency contours for declarative sentences of Japanese. Journal of the Acoustical Society of Japan (E) 5(4), 233–241 (1984)
Ohman, S.: Word and sentence intonation, a quantitative model. Tech. Rep., Department of Speech Communication, Royal Institute of Technology, KTH (1967)
Zervas, P., Geourga, I., Fakotakis, N., Kokkinakis, G.: Greek Emotional Database: Construction and Linguistic Analysis. In: Proc. of 6th International Conference of Greek Linguistics, Rethymno (2003)
Oatley, K., Gholamain, M.: Emotions and identification: Connections between readers and fiction. In: Hjort, M., Laver, S. (eds.) Emotion and the arts, pp. 263–281. Oxford University Press, New York (1997)
Engberg, I.S., Hansen, A.V.: Documentation of the Danish Emotional Speech Database (DES). Report, Center for PersonKommunikation, Aalborg University, Denmark (1996)
Boersma, P., Weenink, D.: Praat: doing phonetics by computer (Version 4.3.01) (Computer program) (2005), retrieved from http://www.praat.org/
Slaney, M.: Auditory Toolbox, Version 2. Technical Report 1998-010, Interval Research Corporation (1998)
Mixdorff, H.: A novel approach to the fully automatic extraction of Fujisaki model parameters. In: Proc. of ICASSP, Istanbul, Turkey, vol. 3, pp. 1281–1284 (2000)
Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
Aha, D., Kibler, D., Albert, M.: Instance-based learning algorithms. Machine Learning 6, 37–66 (1991)
Stone, M.: Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B 36, 111–147 (1974)
Hammal, Z., Bozkurt, B., Couvreur, L., Unay, D., Caplier, A., Dutoit, T.: Passive versus Active: Vocal Classification System. In: Proc. 13th European Signal Processing Conference, Turkey (2005)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Zervas, P., Mporas, I., Fakotakis, N., Kokkinakis, G. (2006). Employing Fujisaki’s Intonation Model Parameters for Emotion Recognition. In: Antoniou, G., Potamias, G., Spyropoulos, C., Plexousakis, D. (eds.) Advances in Artificial Intelligence. SETN 2006. Lecture Notes in Computer Science, vol. 3955. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752912_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34117-8
Online ISBN: 978-3-540-34118-5