Application of fuzzy C-means clustering algorithm to spectral features for emotion classification from speech

  • New Trends in data pre-processing methods for signal and image classification
Neural Computing and Applications

Abstract

In the present study, emotion recognition from speech signals was performed using the fuzzy C-means (FCM) algorithm. Two kinds of spectral features were obtained from the speech signals: Mel frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC). Certain statistical features were then extracted from these spectral features. After the extracted features were selected, cluster centers were identified with the type-1 FCM algorithm and used as input to the classifier. Specifically, FCM clustering was applied to the Mel coefficients, and the resulting cluster centers served as the classification input. Supervised classifiers such as artificial neural networks (ANN), naive Bayes (NB), k-nearest neighbors (kNN), and support vector machines (SVM) were used for classification. All seven emotions of the EmoDB database were used. The results showed that using FCM as a preprocessing step increased the success rate. A comparison of the classification methods showed that the highest success rate, 92.86%, was obtained with the SVM classifier.
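The clustering step described in the abstract can be illustrated with a minimal sketch of standard (type-1) fuzzy C-means. This is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the number of clusters, the 13-dimensional synthetic "frames" standing in for Mel coefficients, and the choice to flatten the cluster centers into one feature vector are all assumptions made for illustration only.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard type-1 fuzzy C-means (illustrative sketch).

    X has shape (n_samples, n_features). Returns (centers, U), where
    U[i, k] is the membership of sample i in cluster k.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # Centers are means of the samples weighted by membership**m.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]

        # Euclidean distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero

        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    # Recompute centers so they match the final memberships.
    Um = U ** m
    centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
    return centers, U


# Synthetic stand-in for spectral frames (assumption: 13 coefficients
# per frame, two well-separated groups). Real input would be Mel
# coefficients extracted from a speech signal.
rng = np.random.default_rng(42)
frames = np.vstack([
    rng.normal(loc=-5.0, scale=0.5, size=(60, 13)),
    rng.normal(loc=5.0, scale=0.5, size=(60, 13)),
])

centers, U = fuzzy_c_means(frames, n_clusters=2)

# Assumption: the cluster centers are flattened into one fixed-length
# vector that a supervised classifier (e.g. an SVM) could take as input.
feature_vector = centers.flatten()
```

In this sketch the cluster centers compress a variable number of frames into a fixed-length vector, which is one plausible reading of how cluster centers could serve as classifier input. The order of the clusters depends on the random initialization, so a real pipeline would need a consistent ordering rule before concatenating the centers.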

Acknowledgements

The authors acknowledge the support of this study provided by Selcuk University Scientific Research Projects. The authors also thank TUBITAK for their support of this study.

Author information

Corresponding author

Correspondence to Humar Kahramanli.

About this article

Cite this article

Demircan, S., Kahramanli, H. Application of fuzzy C-means clustering algorithm to spectral features for emotion classification from speech. Neural Comput & Applic 29, 59–66 (2018). https://doi.org/10.1007/s00521-016-2712-y

  • DOI: https://doi.org/10.1007/s00521-016-2712-y
