Abstract
Emotion detection is an active research topic because of its potential application to intelligent systems in fields such as neuromarketing, dialogue systems, friendly robotics, vending platforms and amiable banking. Nevertheless, the lack of a benchmarking standard makes it difficult to compare results produced by different methodologies; such comparisons would help the research community improve existing approaches and design new ones. There is also the problem of producing accurate datasets: most emotional speech databases and their associated documentation are either proprietary or not publicly available. Therefore, in this work, two stress-elicited databases containing speech from male and female speakers were collected, and four classification methods are compared for detecting and classifying speech under stress. Results from each method are presented to show the quality of their performance, in addition to the final scores attained, in what is a novel approach in this field of study.
Acknowledgements
This work has been funded by Grants TEC2016-77791-C4-4-R from the Ministry of Economic Affairs and Competitiveness of Spain and Teca-Park/MonParLoc FGCSIC CENIE-0348_CIE_6_E (InterReg Programme) V-A Spain – Portugal (POCTEP) (Grant No. CENIE_TECA-PARK_55_02).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest and that their research has been conducted in compliance with all ethical principles.
About this article
Cite this article
Palacios, D., Rodellar, V., Lázaro, C. et al. An ICA-based method for stress classification from voice samples. Neural Comput & Applic 32, 17887–17897 (2020). https://doi.org/10.1007/s00521-019-04549-3
DOI: https://doi.org/10.1007/s00521-019-04549-3