Speech emotion recognition using FCBF feature selection method and GA-optimized fuzzy ARTMAP neural network

Gharavian, Davood; Sheikhan, Mansour; Nazerieh, Alireza; Garoucy, Sahar

doi:10.1007/s00521-011-0643-1

Original Article
Published: 28 May 2011

Volume 21, pages 2115–2126, (2012)

Neural Computing and Applications

Davood Gharavian¹,², Mansour Sheikhan¹, Alireza Nazerieh¹ & Sahar Garoucy¹

Abstract

Emotion recognition from speech has notable applications in speech-processing systems. This paper investigates how a rich feature set, comprising formant-frequency-related features, pitch-frequency-related features, energy, and the first two mel-frequency cepstral coefficients (MFCCs), affects the performance of speech emotion recognition systems. Different feature sets are evaluated, and the fast correlation-based filter (FCBF) feature selection method is used to determine efficient feature subsets. A fuzzy ARTMAP neural network (FAMNN) architecture is then used to recognize the emotion from speech. A genetic algorithm (GA) is employed to determine the optimum values of the FAMNN choice parameter (α), vigilance parameters (ρ_a, ρ_b, and ρ_ab), and learning rate (β). Experimental results show improved recognition rates for the anger, happiness, and neutral states when a subset of 25 selected features is used with the GA-optimized FAMNN-based emotion recognizer.
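
The abstract names FCBF but does not describe its steps, so the following is only a minimal sketch of the filter as it is commonly described: features are ranked by their symmetrical uncertainty (SU) with the class label, and a lower-ranked feature is dropped when it is at least as correlated with an already kept feature as with the class. The feature names, the toy values, the 0.05 threshold, and the assumption that features are already discretized are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), a value between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0  # both variables are constant
    joint = entropy(list(zip(x, y)))
    mutual_info = hx + hy - joint
    return 2.0 * mutual_info / (hx + hy)


def fcbf(features, labels, threshold=0.0):
    """Return the names of the selected features, most relevant first.

    features: dict mapping a feature name to a list of discretized values
    labels:   list of class labels, one per sample
    """
    # Step 1: relevance of each feature to the class; drop weak features.
    relevance = {
        name: symmetrical_uncertainty(column, labels)
        for name, column in features.items()
    }
    ranked = sorted(
        (name for name in relevance if relevance[name] >= threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )
    # Step 2: walk down the ranking; a feature is redundant if some kept
    # feature is at least as correlated with it as the class label is.
    selected = []
    for name in ranked:
        redundant = any(
            symmetrical_uncertainty(features[kept], features[name])
            >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Hypothetical toy data: six utterances, four discretized features.
labels = ["angry", "angry", "happy", "happy", "neutral", "neutral"]
features = {
    "pitch": [0, 0, 1, 1, 1, 1],           # separates "angry" from the rest
    "pitch_inverted": [1, 1, 0, 0, 0, 0],  # carries the same information
    "energy": [0, 0, 0, 0, 1, 1],          # separates "neutral" from the rest
    "noise": [0, 1, 0, 1, 0, 1],           # unrelated to the label
}
selected = fcbf(features, labels, threshold=0.05)
print(selected)
```

In this toy setting the inverted pitch copy is discarded as redundant and the noise feature falls below the relevance threshold, while the two complementary features survive. Real acoustic features are continuous and would need a discretization step before SU can be computed this way.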

Acknowledgment

This work is supported by Islamic Azad University-South Tehran Branch under a research project entitled "Emotion Modelling to Improve Speech Recognition Accuracy in Farsi Language".

Author information

Authors and Affiliations

EE Department, Islamic Azad University, South Tehran Branch, Tehran, Iran
Davood Gharavian, Mansour Sheikhan, Alireza Nazerieh & Sahar Garoucy
EE Department, Shahid Abbaspour University, Tehran, Iran
Davood Gharavian

Corresponding author

Correspondence to Davood Gharavian.

About this article

Cite this article

Gharavian, D., Sheikhan, M., Nazerieh, A. et al. Speech emotion recognition using FCBF feature selection method and GA-optimized fuzzy ARTMAP neural network. Neural Comput & Applic 21, 2115–2126 (2012). https://doi.org/10.1007/s00521-011-0643-1

Received: 15 January 2011
Accepted: 06 May 2011
Published: 28 May 2011
Issue Date: November 2012
DOI: https://doi.org/10.1007/s00521-011-0643-1
