A Pattern Mining Approach in Feature Extraction for Emotion Recognition from Speech

Avci, Umut; Akkurt, Gamze; Unay, Devrim

doi:10.1007/978-3-030-26061-3_6

Conference paper
First Online: 24 July 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11658)

Abstract

We address the problem of recognizing emotions from speech using features derived from emotional patterns. Because much work in the field focuses on low-level acoustic features, we explicitly study whether high-level features are useful for classifying emotions. For this purpose, we convert a continuous speech signal to a discretized signal and extract discriminative patterns that are capable of distinguishing distinct emotions from each other. The extracted patterns are then used to create a feature set that is fed into a classifier. Experimental results show that patterns alone are good predictors of emotions. When used to build a classifier, pattern features achieve accuracy gains of up to 25% compared to state-of-the-art acoustic features.
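The pipeline outlined in the abstract (discretize the signal, mine discriminative patterns, turn them into features) can be illustrated with a minimal sketch. The abstract does not specify the discretization scheme, the pattern definition, or the selection criterion, so everything below is an assumption made for illustration: equal-width binning, contiguous n-grams as patterns, and a simple support-difference score. The function names `discretize`, `ngrams`, `mine_patterns`, and `featurize` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def discretize(signal, n_bins=4):
    """Map each sample to a symbol in 0..n_bins-1 (equal-width bins, an assumption)."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant signals
    return [min(int((x - lo) / width), n_bins - 1) for x in signal]

def ngrams(symbols, n):
    """Set of contiguous length-n patterns that occur in a symbol sequence."""
    return {tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)}

def mine_patterns(sequences, labels, n=3, top_k=2):
    """For each class, keep the top_k patterns whose support inside the class
    most exceeds their best support in any other class (illustrative criterion)."""
    classes = sorted(set(labels))
    sizes = Counter(labels)
    support = {c: Counter() for c in classes}
    for seq, lab in zip(sequences, labels):
        support[lab].update(ngrams(seq, n))

    selected = []
    for c in classes:
        def contrast(p):
            inside = support[c][p] / sizes[c]
            outside = max((support[o][p] / sizes[o] for o in classes if o != c),
                          default=0.0)
            return inside - outside
        # Sort by descending contrast; break ties by the pattern itself.
        ranked = sorted(support[c], key=lambda p: (-contrast(p), p))
        for p in ranked[:top_k]:
            if p not in selected:
                selected.append(p)
    return selected

def featurize(symbols, patterns):
    """Binary feature vector: 1 if the pattern occurs in the sequence, else 0."""
    present = ngrams(symbols, len(patterns[0]))
    return [1 if p in present else 0 for p in patterns]

# Toy data: one rising and one falling contour standing in for two emotions.
rising = discretize([0, 1, 2, 3, 4, 5, 6, 7])
falling = discretize([7, 6, 5, 4, 3, 2, 1, 0])
patterns = mine_patterns([rising, falling], ["rising", "falling"])
print(featurize(rising, patterns))
print(featurize(falling, patterns))
```

In this toy setting the two contours share no 3-gram, so each feature vector activates only the patterns mined from its own class. The resulting vectors could then be passed to any standard classifier; which classifier and which mining algorithm the authors actually use is described in the full paper, not in the abstract.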

Author information

Authors and Affiliations

Faculty of Engineering, Department of Software Engineering, Yasar University, Bornova, Izmir, Turkey
Umut Avci
Faculty of Engineering, Department of Computer Engineering, Izmir University of Economics, Balcova, Izmir, Turkey
Gamze Akkurt
Faculty of Engineering, Department of Biomedical Engineering, Izmir University of Economics, Balcova, Izmir, Turkey
Devrim Unay

Corresponding author

Correspondence to Umut Avci.

Editor information

Editors and Affiliations

Utrecht University, Utrecht, The Netherlands
Albert Ali Salah
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova

Cite this paper

Avci, U., Akkurt, G., Unay, D. (2019). A Pattern Mining Approach in Feature Extraction for Emotion Recognition from Speech. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_6

DOI: https://doi.org/10.1007/978-3-030-26061-3_6
Published: 24 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26060-6
Online ISBN: 978-3-030-26061-3
eBook Packages: Computer Science, Computer Science (R0)
