
Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition

Published in the International Journal of Speech Technology

Abstract

In this paper, a process for selecting a classifier based on the properties of a dataset is designed, since it is impractical to experiment with the data on an arbitrary number of classifiers. Speech emotion recognition is considered as a case study. Different combinations of spectral and prosodic features relevant to emotions are explored, and the best subset of the chosen features is recommended for each classifier based on the properties of the dataset. Various statistical tests are used to estimate these properties, and the nature of the dataset thus revealed guides the selection of a relevant classifier. To make the comparison more precise, three other clustering and classification techniques, namely K-means clustering, vector quantization and artificial neural networks, are used for experimentation, and their results are compared with those of the selected classifier. Prosodic features such as pitch, intensity, jitter and shimmer, and spectral features such as mel-frequency cepstral coefficients (MFCCs) and formants, are considered in this work. Statistical parameters of prosody, namely the minimum, maximum, mean (\(\mu\)) and standard deviation (\(\sigma\)), are extracted from speech and combined with the basic spectral (MFCC) features to obtain better performance. Five basic emotions are considered: anger, fear, happiness, neutral and sadness. For analysing the performance of different datasets on different classifiers, content- and speaker-independent emotional data collected from Telugu movies is used; mean opinion scores from fifty users are collected to label this data. To generalize the conclusions, the benchmark IIT-Kharagpur emotional speech database is used in addition.
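The abstract does not name the specific statistical tests used to characterize a dataset, but the following minimal Python sketch shows how such properties (normality, skewness, kurtosis) are commonly estimated per feature; the function name and thresholds are illustrative assumptions, not the authors' code.

```python
# A minimal sketch (not the paper's implementation) of estimating dataset
# properties with standard statistical tests before committing to a classifier.
import numpy as np
from scipy import stats

def describe_feature(x, alpha=0.05):
    """Summarise one feature column: Shapiro-Wilk normality,
    skewness and excess kurtosis."""
    _, p = stats.shapiro(x)              # H0: the sample is Gaussian
    return {
        "normal": p > alpha,             # True if normality is not rejected
        "skewness": float(stats.skew(x)),
        "kurtosis": float(stats.kurtosis(x)),  # excess kurtosis; 0 for a Gaussian
    }

rng = np.random.default_rng(0)
print(describe_feature(rng.normal(size=500)))       # roughly Gaussian
print(describe_feature(rng.exponential(size=500)))  # heavily skewed
```

A roughly Gaussian feature distribution would, for instance, favour a Gaussian-model-based classifier, while strongly skewed or heavy-tailed data may suit non-parametric alternatives.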

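As a rough illustration of the feature set described above, the sketch below builds one utterance-level vector from prosodic statistics (minimum, maximum, \(\mu\), \(\sigma\) of pitch and intensity) and mean MFCCs. It assumes the librosa library; the function name and the simple first-order jitter approximation are our assumptions, not the paper's exact method.

```python
# A minimal sketch, assuming librosa, of combining prosodic statistics
# with mean MFCCs into one feature vector per utterance.
import numpy as np
import librosa

def emotion_feature_vector(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=None)

    # Prosody: fundamental frequency (YIN) and frame-wise RMS intensity.
    f0 = librosa.yin(y, fmin=65, fmax=500, sr=sr)
    f0 = f0[np.isfinite(f0)]
    rms = librosa.feature.rms(y=y)[0]

    def stats4(x):  # minimum, maximum, mean (mu), standard deviation (sigma)
        return [x.min(), x.max(), x.mean(), x.std()]

    # Crude jitter proxy (illustrative only): relative cycle-to-cycle
    # variation of the pitch period.
    periods = 1.0 / f0
    jitter = np.mean(np.abs(np.diff(periods))) / periods.mean()

    # Spectral part: mean of each MFCC over time.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

    return np.hstack([stats4(f0), stats4(rms), jitter, mfcc])
```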


Author information


Corresponding author

Correspondence to Y. V. Srinivasa Murthy.


About this article


Cite this article

Koolagudi, S.G., Murthy, Y.V.S. & Bhaskar, S.P. Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition. Int J Speech Technol 21, 167–183 (2018). https://doi.org/10.1007/s10772-018-9495-8

