Abstract
This paper reports on initial experiments in creating a database suitable for training and testing systems for stress detection in speech, together with first experimental results. Based on the psychological understanding of the concepts of stress and emotion, we operationalized stress as a level of arousal, which can be detected in speech. We describe a speech database with three levels of “acted stress” and three levels of soothing. In the first experiment performed on the database, we detect different levels of stress using Gaussian mixture models (GMMs). The accuracy of detecting three levels of stress was 89 % for speakers included in the training database and 73 % for speakers whose recordings were not used during the adaptation of the GMMs.
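As an illustration of the decision rule that GMM-based detection of this kind typically relies on, the following minimal sketch scores an utterance against one diagonal-covariance GMM per stress level and picks the level with the highest average log-likelihood. It is not the authors' system: the model parameters, the two-dimensional features, and the names `log_gauss_diag`, `gmm_log_likelihood`, `classify`, and `MODELS` are all assumptions made for the example, and training or adaptation of the GMMs is not shown.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under one GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        # log sum_k w_k N(x | mu_k, var_k), computed with log-sum-exp
        terms = [math.log(w) + log_gauss_diag(x, mu, var) for w, mu, var in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def classify(frames, models):
    """Return the label whose GMM gives the highest average log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, models[label]))


# Hypothetical models over invented 2-D features (for instance, normalized
# pitch and energy). All numbers are made up for illustration only.
MODELS = {
    "low": [(0.5, (-1.0, -1.0), (0.5, 0.5)), (0.5, (-0.5, -1.5), (0.5, 0.5))],
    "medium": [(0.5, (0.0, 0.0), (0.5, 0.5)), (0.5, (0.5, -0.5), (0.5, 0.5))],
    "high": [(0.5, (1.5, 1.0), (0.5, 0.5)), (0.5, (1.0, 2.0), (0.5, 0.5))],
}

# Hypothetical frames from one utterance.
utterance = [(1.4, 1.1), (1.1, 1.8), (1.6, 0.9)]
print(classify(utterance, MODELS))
```

Averaging the log-likelihood over frames keeps scores comparable across utterances of different lengths. A real system would estimate the component parameters from labeled recordings instead of fixing them by hand.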
References
Macková, L., Čižmár, A., Juhár, J.: A study of acoustic features for emotional speaker recognition in i-vector representation. Acta Electrotechnica et Informatica 15(2), 15–20 (2015)
Vizer, L.M., Zhou, L., Sears, A.: Automated stress detection using keystroke and linguistic features: an exploratory study. Int. J. Hum. Comput. Stud. 67(10), 870–886 (2009)
Kurniawan, H., Maslov, A.V., Pechenizkiy, M.: Stress detection from speech and galvanic skin response signals. In: Computer-Based Medical Systems, pp. 209–214 (2013)
Zhang, C., Hansen, J.H.L.: Analysis and classification of speech mode: whispered through shouted. In: Interspeech 2007, Antwerp, Belgium, pp. 2289–2292 (2007)
Ruzanski, E., Hansen, J.H., et al.: Effects of phoneme characteristics on TEO feature-based automatic stress detection in speech. In: ICASSP (1), pp. 357–360 (2005)
Womack, B.D., Hansen, J.H.: Classification of speech under stress using target driven features. Speech Commun. 20(1), 131–150 (1996)
McEwen, B.S., Wingfield, J.C.: The concept of allostasis in biology and biomedicine. Horm. Behav. 43(1), 2–15 (2003)
Chrousos, G.P.: Stressors, stress, and neuroendocrine integration of the adaptive response: the 1997 Hans Selye Memorial Lecture. Ann. N. Y. Acad. Sci. 851(1), 311–335 (1998)
Lazarus, R.S.: From psychological stress to the emotions: a history of changing outlooks. Pers. Crit. Concepts Psychol. 4, 179 (1998)
Cannon, W.: The wisdom of the body. Physiol. Rev. 9, 399–431 (1929)
Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39(6), 1161–1178 (1980)
Dougall, A.L., Baum, A.: Stress, coping and immune function. In: Weiner, I.B., et al. (eds.) Handbook of Psychology, vol. 3, pp. 441–456. Wiley, New York (2003, 2009)
Thayer, R.E.: The Activation-Deactivation Adjective Check List (AD ACL). APPENDIX I, The Biopsychology of Mood and Arousal. Oxford University Press, New York (1989)
Hansen, J.H., Patil, S.: Speech under stress: analysis, modeling and recognition. In: Müller, C. (ed.) Speaker Classification 2007. LNCS (LNAI), vol. 4343, pp. 108–137. Springer, Heidelberg (2007)
Šimko, J., Beňuš, Š., Vainio, M.: Hyperarticulation in Lombard speech: global coordination of the jaw, lips and the tongue. J. Acoust. Soc. Am. 139(1), 151–162 (2016)
Rusko, M., Darjaa, S., Trnka, M., Ritomský, M., Sabo, R.: Alert!… Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis. In: LREC, pp. 1182–1187 (2014)
Scherer, K.R.: Vocal communication of emotion: a review of research paradigms. Speech Commun. 40, 227–256 (2003)
Rusko, M., Trnka, M., Darjaa, S., Hamar, J.: The dramatic piece reader for the blind and visually impaired. In: Proceedings of SLPAT 2013, pp. 83–91 (2013)
Gajšek, R., et al.: Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation. In: Proceedings of the Interspeech, pp. 2810–2813 (2010)
Panayotov, V., Chen, G., Povey, D., Khudanpur, S.: Librispeech: an ASR corpus based on public domain audio books. In: Acoustics, ICASSP 2015, pp. 5206–5210 (2015)
Acknowledgement
The research leading to the results presented in this paper has received funding from the European Union FP7 under grant agreement no. 312382 (GAMMA - Global ATM Security Management project [22]).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Sabo, R., Rusko, M., Ridzik, A., Rajčáni, J. (2016). Stress, Arousal, and Stress Detector Trained on Acted Speech Database. In: Ronzhin, A., Potapova, R., Németh, G. (eds) Speech and Computer. SPECOM 2016. Lecture Notes in Computer Science, vol 9811. Springer, Cham. https://doi.org/10.1007/978-3-319-43958-7_82
DOI: https://doi.org/10.1007/978-3-319-43958-7_82
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43957-0
Online ISBN: 978-3-319-43958-7
eBook Packages: Computer Science, Computer Science (R0)