A novel speech emotion recognition algorithm based on wavelet kernel sparse classifier in stacked deep auto-encoder model

Wei, Pengcheng; Zhao, Yu

doi:10.1007/s00779-019-01246-9

Original Article
Published: 24 June 2019

Volume 23, pages 521–529, (2019)

Personal and Ubiquitous Computing

Pengcheng Wei¹ &
Yu Zhao¹

Abstract

Because contextual information strongly influences a speaker’s emotional state, a key problem is how to use emotion-related context in feature learning. Existing speech emotion recognition algorithms achieve relatively high recognition rates, but they do not transfer well to real-life speech emotion recognition systems. To address these issues, this paper proposes a novel speech emotion recognition algorithm based on an improved stacked kernel sparse deep model, which combines the auto-encoder, denoising auto-encoder, and sparse auto-encoder to improve Chinese speech emotion recognition. The first layer of the structure uses a denoising auto-encoder to learn hidden features whose dimension is larger than that of the input features, and the second layer uses a sparse auto-encoder to learn sparse features. Finally, a wavelet-kernel sparse SVM classifier classifies the learned features. The proposed algorithm is evaluated on a test dataset containing spontaneous, non-prototypical, and long-term speech emotion data. The experimental results show that the proposed algorithm outperforms existing state-of-the-art algorithms in speech emotion recognition.
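
The two-layer structure and the kernel described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: masking noise as the corruption, sigmoid activations, a KL-divergence sparsity penalty, the Morlet-style wavelet kernel commonly used in wavelet SVMs, the layer sizes, and all function names. The weights are random and untrained, and the inputs are toy vectors rather than speech features, so the sketch shows only the training-time forward pass and the Gram matrix a kernel SVM would consume, not the authors' method.

```python
import math
import random


def mask_noise(x, drop_prob, rng):
    """Denoising corruption (assumed masking noise): zero each input with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(x, weights, biases):
    """One encoder layer, h = sigmoid(W x + b); weights holds one row per hidden unit."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def kl_sparsity(mean_activations, rho=0.05):
    """KL-divergence sparsity penalty (assumed form) against a target activation rho."""
    return sum(rho * math.log(rho / r) + (1 - rho) * math.log((1 - rho) / (1 - r))
               for r in mean_activations)


def wavelet_kernel(x, y, a=1.0):
    """Morlet-style wavelet kernel (assumed form): prod_i cos(1.75 u_i) * exp(-u_i^2 / 2)."""
    k = 1.0
    for xi, yi in zip(x, y):
        u = (xi - yi) / a  # a is the dilation parameter
        k *= math.cos(1.75 * u) * math.exp(-0.5 * u * u)
    return k


def random_layer(n_out, n_in, rng):
    """Untrained random weights and zero biases, for shape illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
n_in, n_hidden1, n_hidden2 = 4, 6, 3  # layer 1 is wider than the input
W1, b1 = random_layer(n_hidden1, n_in, rng)
W2, b2 = random_layer(n_hidden2, n_hidden1, rng)

# Toy inputs standing in for acoustic feature vectors (not real speech data).
samples = [[rng.random() for _ in range(n_in)] for _ in range(5)]

features = []
for x in samples:
    h1 = encode(mask_noise(x, 0.2, rng), W1, b1)  # layer 1: denoising encoder
    h2 = encode(h1, W2, b2)                       # layer 2: sparse encoder
    features.append(h2)

# Mean activation of each layer-2 unit; training would add this penalty to the loss.
mean_act = [sum(f[j] for f in features) / len(features) for j in range(n_hidden2)]
penalty = kl_sparsity(mean_act)

# Gram matrix of the learned features under the wavelet kernel.
gram = [[wavelet_kernel(fi, fj) for fj in features] for fi in features]
```

A Gram matrix of this kind could be handed to any SVM implementation that accepts precomputed kernels. How the paper enforces sparsity in the classifier itself is not stated in the abstract and is not modeled here.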

Funding

This work was supported by Chongqing Big Data Engineering Laboratory for Children, Chongqing Electronics Engineering Technology Research Center for Interactive Learning, and the Project of Science and Technology Research Program of Chongqing Education Commission of China (No. KJZD-K201801601).

Author information

Authors and Affiliations

School of Mathematics and Information Engineering, Chongqing University of Education, Chongqing, China
Pengcheng Wei & Yu Zhao

Authors

Pengcheng Wei
Yu Zhao

Corresponding author

Correspondence to Pengcheng Wei.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wei, P., Zhao, Y. A novel speech emotion recognition algorithm based on wavelet kernel sparse classifier in stacked deep auto-encoder model. Pers Ubiquit Comput 23, 521–529 (2019). https://doi.org/10.1007/s00779-019-01246-9

Received: 20 April 2019
Accepted: 11 June 2019
Published: 24 June 2019
Issue Date: 17 July 2019
DOI: https://doi.org/10.1007/s00779-019-01246-9
