Discriminating Speakers by Their Voices — A Fusion Based Approach

Sayoud, Halim; Ouamour, Siham; Hamadache, Zohra

doi:10.1007/978-3-319-66429-3_31

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10458)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

The task of Speaker Discrimination (SD) consists of checking whether two speech segments belong to the same speaker. In this research field, it is often difficult to decide which classifier is best in terms of accuracy and robustness. To address this question, we implemented nine classifiers: Support Vector Machines, Linear Discriminant Analysis, Multi-Layer Perceptron, Generalized Linear Model, Self-Organizing Map, AdaBoost, Second Order Statistical Measures, Linear Regression and Gaussian Mixture Models. Furthermore, a new fusion approach is proposed and evaluated for speaker discrimination. Several speaker discrimination experiments were conducted on the Hub4 Broadcast-News corpus with relatively short segments. The results show that the best individual classifier is the SVM and that the proposed fusion approach provided the best performance overall.
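
The abstract does not specify how the classifier outputs are fused, so the following is only a generic sketch of two common fusion rules (weighted score averaging and majority voting), not the authors' method. The function names, the score range [0, 1], and the 0.5 threshold are all assumptions made for illustration.

```python
def fuse_scores(scores, weights=None, threshold=0.5):
    """Weighted-average score fusion (illustrative, not the paper's rule).

    scores: one "same speaker" score per classifier, assumed to lie in [0, 1].
    weights: optional per-classifier weights; equal weights if omitted.
    Returns (fused_score, decision), where decision is True for "same speaker".
    """
    if not scores:
        raise ValueError("at least one classifier score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = sum(w * s for w, s in zip(weights, scores)) / total
    return fused, fused >= threshold


def majority_vote(decisions):
    """Decision-level fusion: True only if a strict majority votes True."""
    votes = sum(1 for d in decisions if d)
    return votes * 2 > len(decisions)


if __name__ == "__main__":
    # Hypothetical scores from three classifiers for one pair of segments.
    fused, same = fuse_scores([0.9, 0.2, 0.7])
    print(round(fused, 3), same)
    print(majority_vote([True, False, True]))
```

Score-level fusion keeps each classifier's confidence, whereas majority voting uses only the hard decisions; which of the two (if either) resembles the proposed approach cannot be determined from the abstract alone.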

Author information

Authors and Affiliations

Electronics and Computer Engineering Faculty, USTHB University, Bab Ezzouar, Algeria
Halim Sayoud, Siham Ouamour & Zohra Hamadache

Corresponding author

Correspondence to Halim Sayoud.

Editor information

Editors and Affiliations

SPIIRAS, Saint Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
University of Hertfordshire, Hatfield, United Kingdom
Iosif Mporas

About this paper

Cite this paper

Sayoud, H., Ouamour, S., Hamadache, Z. (2017). Discriminating Speakers by Their Voices — A Fusion Based Approach. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_31

DOI: https://doi.org/10.1007/978-3-319-66429-3_31
Published: 13 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66428-6
Online ISBN: 978-3-319-66429-3
eBook Packages: Computer Science, Computer Science (R0)