
Emotions recognition: different sets of features and models


Abstract

Affective computing enables better and more effective human–machine communication, and in recent years active research has addressed emotion recognition using various databases. This paper emphasizes the effectiveness of different sets of features and modeling techniques in evaluating the performance of multiple speaker-independent and speaker-dependent emotion recognition systems. Improving the performance of an emotion recognition system is challenging because the Berlin EMO-DB database used in this work contains only ten sentences uttered by ten speakers in seven emotions: anger, boredom, disgust, fear, happiness, sadness, and neutral. Speaker-dependent and speaker-independent emotion recognition is performed by creating models for all emotions using a clustering technique, Gaussian mixture modeling (GMM), and continuous density hidden Markov modeling (CDHMM). With clustering used as the modeling technique, the system is also evaluated on mel frequency cepstral coefficients (MFCC), MFCC concatenated with probability and shifted delta cepstrum (SDC), mel frequency linear predictive cepstrum (MFPLPC), MFPLPC concatenated with probability and SDC, and formants. These features provide complementary evidence in assessing the performance of the system based on the vector quantization (VQ) clustering technique, which achieves 99% and 100% overall weighted accuracy recall (WAR) for correct identification of emotion using any one feature and modeling technique.
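To make the pipeline concrete, the sketch below illustrates one of the configurations described above: per-frame MFCCs concatenated with shifted delta cepstra, scored against one GMM per emotion. It assumes librosa for feature extraction and scikit-learn for the GMMs; the SDC parameters (d, P, k), the GMM size, and the file layout are illustrative assumptions, not the paper's settings, and the probability feature, MFPLPC, VQ, and CDHMM variants are not reproduced here.

```python
# Minimal sketch (not the authors' implementation): MFCC + SDC features
# scored against one GMM per emotion. librosa/scikit-learn and all
# parameter values below are assumptions for illustration.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def sdc(cepstra, d=1, P=3, k=7):
    """Shifted delta cepstra (d-P-k scheme): for each frame t, stack the
    k deltas c[t + i*P + d] - c[t + i*P - d] for i = 0..k-1."""
    T = cepstra.shape[0]
    # Edge-pad so every frame has a full stack and output aligns with input.
    c = np.pad(cepstra, ((d, (k - 1) * P + d), (0, 0)), mode="edge")
    return np.hstack([c[2 * d + i * P : 2 * d + i * P + T]
                      - c[i * P : i * P + T]
                      for i in range(k)])

def features(path):
    """Per-frame feature vectors: 13 MFCCs concatenated with their SDC."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T   # (frames, 13)
    return np.hstack([mfcc, sdc(mfcc)])

def train(files_by_emotion):
    """Fit one diagonal-covariance GMM per emotion on pooled frames.
    files_by_emotion: dict mapping emotion label -> list of wav paths."""
    return {emotion: GaussianMixture(n_components=8,
                                     covariance_type="diag").fit(
                np.vstack([features(p) for p in paths]))
            for emotion, paths in files_by_emotion.items()}

def classify(models, path):
    """Return the emotion whose GMM gives the highest mean log-likelihood."""
    X = features(path)
    return max(models, key=lambda emotion: models[emotion].score(X))
```

In the speaker-independent setting, training and test sets would contain disjoint speakers; replacing GaussianMixture with a k-means codebook scored by distortion rather than likelihood would approximate the VQ-clustering variant.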


Data availability

All relevant data are within the paper and its supporting information files.


Author information

Corresponding author

Correspondence to C. Jeyalakshmi.

Ethics declarations

Conflict of interest

The authors have declared that no competing interests exist.


About this article


Cite this article

Revathi, A., Jeyalakshmi, C. Emotions recognition: different sets of features and models. Int J Speech Technol 22, 473–482 (2019). https://doi.org/10.1007/s10772-018-9533-6


