Language dialect based speech emotion recognition through deep learning techniques

Rajendran, Sukumar; Mathivanan, Sandeep Kumar; Jayagopal, Prabhu; Venkatasen, Maheshwari; Pandi, Thanapal; Sorakaya Somanathan, Manivannan; Thangaval, Muthamilselvan; Mani, Prasanna

doi:10.1007/s10772-021-09838-8

Published: 22 April 2021

Volume 24, pages 625–635, (2021)

International Journal of Speech Technology

Sukumar Rajendran¹,
Sandeep Kumar Mathivanan¹,
Prabhu Jayagopal¹,
Maheshwari Venkatasen¹,
Thanapal Pandi¹,
Manivannan Sorakaya Somanathan¹,
Muthamilselvan Thangaval¹ &
Prasanna Mani¹


Abstract

Vocal signals are the most primordial means of communication and pave the way for support between individuals in a social structure. Computer applications make it possible to combine Automatic Speech Recognition (ASR) with Speech Emotion Recognition (SER) to detect and identify emotions in speech signals. The semantic relatedness of words with abstract concepts proves to be more complicated than that of words with concrete ideas. To tackle this issue, an ensemble of different clustering techniques is used to automatically separate sense distinctions across the various dialects of spoken sentences. The interpretation of a word's sense may change over time and across groups of people. The proposed model maps characters to word senses using weights provided by SenticNet, adjusted through trial-and-error tuning. It uses stop words to distinguish word senses, reaching 72.78% accuracy for regional dialects.
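The abstract does not say which clustering techniques are combined or how their outputs are merged. As an illustration only, the sketch below shows one common way to build a clustering ensemble: a co-association (evidence accumulation) matrix followed by a simple threshold. The function names, the threshold value, and the toy label lists are all assumptions made for this example and are not taken from the article.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of base clusterings that place each pair of items together."""
    n = len(labelings[0])
    k = len(labelings)
    counts = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / k for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Merge items whose co-association is above the threshold (union-find)."""
    n = len(labelings[0])
    matrix = co_association(labelings)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical data: five occurrences of one ambiguous word, each assigned
# a sense cluster by three different base clusterers. Label names are
# arbitrary per clusterer; only co-membership matters.
labelings = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]

print(consensus_clusters(labelings))  # prints [0, 0, 1, 1, 1]
```

Because the consensus depends only on which items share a cluster, the base clusterers do not need to agree on label names or on the number of clusters.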



Author information

Authors and Affiliations

School of Information Technology and Engineering, VIT, Vellore, India
Sukumar Rajendran, Sandeep Kumar Mathivanan, Prabhu Jayagopal, Maheshwari Venkatasen, Thanapal Pandi, Manivannan Sorakaya Somanathan, Muthamilselvan Thangaval & Prasanna Mani

Corresponding author

Correspondence to Sandeep Kumar Mathivanan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rajendran, S., Mathivanan, S., Jayagopal, P. et al. Language dialect based speech emotion recognition through deep learning techniques. Int J Speech Technol 24, 625–635 (2021). https://doi.org/10.1007/s10772-021-09838-8

Received: 28 September 2020
Accepted: 31 March 2021
Published: 22 April 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s10772-021-09838-8
