Abstract
In this paper, we propose an example-specific density based matching kernel (ESDMK) for the classification of varying length patterns of long duration speech represented as sets of feature vectors. The proposed kernel is computed between a pair of examples, each represented as a set of feature vectors, by matching the estimates of the example-specific densities computed at every feature vector in the two examples. In this work, the number of feature vectors of an example among the K nearest neighbors of a feature vector is taken as the estimate of that example's density at the feature vector. The minimum of the two example-specific density estimates, one for each example, at a feature vector is taken as the matching score. The ESDMK is then computed as the sum of the matching scores computed at every feature vector in the pair of examples. We study the performance of support vector machine (SVM) based classifiers using the proposed ESDMK for speech emotion recognition and speaker identification tasks, and compare it with that of SVM-based classifiers using state-of-the-art kernels for varying length patterns.
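The kernel computation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the K nearest neighbors are searched in the pooled set of feature vectors from both examples, distances are Euclidean, each feature vector counts as one of its own neighbors, and K is capped at the pooled set size. The function name `esdmk` and the parameter `k` are hypothetical.

```python
import numpy as np


def esdmk(X, Y, k=5):
    """Sketch of an example-specific density based matching kernel.

    X, Y: arrays of shape (n_x, d) and (n_y, d), one row per feature vector.
    k:    number of nearest neighbors (hypothetical parameter name).
    Assumptions (not stated in the abstract): neighbors are searched in the
    pooled vectors of both examples, with Euclidean distance, and a vector
    is counted among its own neighbors.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    pooled = np.vstack([X, Y])

    # True for pooled vectors that came from X, False for those from Y.
    from_x = np.arange(len(pooled)) < len(X)

    # K cannot exceed the number of pooled vectors.
    k = min(k, len(pooled))

    # Pairwise squared Euclidean distances between all pooled vectors.
    diff = pooled[:, None, :] - pooled[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    # Indices of the k nearest neighbors of every pooled vector.
    nn = np.argsort(d2, axis=1, kind="stable")[:, :k]

    # Example-specific density estimates: neighbor counts per example.
    count_x = from_x[nn].sum(axis=1)
    count_y = k - count_x

    # Matching score at each vector is the smaller count; the kernel sums them.
    return float(np.minimum(count_x, count_y).sum())
```

Under these assumptions, two examples whose feature vectors are well interleaved yield a large value, while two examples that occupy separate regions of the feature space yield a value near zero. A Gram matrix built from such pairwise values could then be supplied to an SVM implementation that accepts precomputed kernels.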
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sachdev, A., Dileep, A.D., Thenkanidiyoor, V. (2015). Example-Specific Density Based Matching Kernel for Classification of Varying Length Patterns of Speech Using Support Vector Machines. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9489. Springer, Cham. https://doi.org/10.1007/978-3-319-26532-2_20
DOI: https://doi.org/10.1007/978-3-319-26532-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26531-5
Online ISBN: 978-3-319-26532-2
eBook Packages: Computer Science, Computer Science (R0)