A Comparative Study on MFCC and Fundamental Frequency Based Speech Emotion Classification

  • Conference paper
Distributed Computing and Intelligent Technology (ICDCIT 2022)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13145)

Abstract

Speech emotion recognition and classification is one of the most important emerging fields in artificial intelligence, with applications ranging from medical science to smart home devices. Input feature selection is a very important part of speech processing. Mel Frequency Cepstral Coefficients (MFCCs) are the most widely used features in audio data processing. For emotion-related data, the fundamental frequency also plays an important role. This study presents a comparative analysis to determine which of the two features is better suited to emotion classification. The Emo-Db database was used for the study. For the classification task, a Support Vector Machine classifier was used with radial basis function and sigmoid kernels. The model was trained on each audio feature in turn and the performances were compared. Better performance was observed with Mel Frequency Cepstral Coefficients, which indicates that they are the better-performing speech feature for the emotion classification task.
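
As an illustration of the two kernels named in the abstract, the following minimal sketch writes out the radial basis function and sigmoid kernels in their standard textbook form. The abstract does not report any hyperparameters, so the `gamma` and `coef0` defaults are assumptions, and `gram_matrix` is a hypothetical helper that is not taken from the paper. The input vectors stand in for per-utterance features (MFCC-based or fundamental-frequency-based); how the authors extracted or summarised those features is not specified here.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 0.1) -> float:
    """Radial basis function kernel: exp(-gamma * ||x - y||^2).

    gamma is an assumed value, not one reported in the abstract.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x: Vector, y: Vector, gamma: float = 0.1,
                   coef0: float = 0.0) -> float:
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0).

    gamma and coef0 are assumed values, not ones reported in the abstract.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


def gram_matrix(features: List[Vector],
                kernel: Callable[[Vector, Vector], float]) -> List[List[float]]:
    """Hypothetical helper: pairwise kernel values between feature vectors."""
    return [[kernel(u, v) for v in features] for u in features]
```

A Gram matrix computed this way over one feature set at a time is what a kernel SVM operates on, so the same classifier can be trained once per feature type and once per kernel, and the resulting accuracies compared.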

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Shah, A., Bhowmik, T. (2022). A Comparative Study on MFCC and Fundamental Frequency Based Speech Emotion Classification. In: Bapi, R., Kulkarni, S., Mohalik, S., Peri, S. (eds) Distributed Computing and Intelligent Technology. ICDCIT 2022. Lecture Notes in Computer Science, vol 13145. Springer, Cham. https://doi.org/10.1007/978-3-030-94876-4_12

  • DOI: https://doi.org/10.1007/978-3-030-94876-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94875-7

  • Online ISBN: 978-3-030-94876-4

  • eBook Packages: Computer Science, Computer Science (R0)
