A New Text Independent Speaker Recognition System with Short Utterances Using SVM

Chakroun, Rania; Frikha, Mondher

doi:10.1007/978-3-030-63396-7_38

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 402)

Included in the following conference series:

European, Mediterranean, and Middle Eastern Conference on Information Systems

Abstract

Recent advances in the field of speaker recognition have produced algorithms that perform very well. However, this performance degrades when only limited data are available. This paper presents examples of how Support Vector Machines (SVM) can improve speaker recognition for short utterances. The main contribution of this approach is the use of new vectors when training and testing data are limited. We show how different SVM kernel functions can be used to validate the new approach with different speakers from different databases.
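The abstract does not say which SVM kernel functions were compared, so the following is only a minimal sketch of three kernels commonly used with SVMs (linear, polynomial, and RBF). The feature vectors, the function names, and the parameter values (`degree`, `coef0`, `gamma`) are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def linear_kernel(x: Vector, y: Vector) -> float:
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x: Vector, y: Vector, degree: int = 3, coef0: float = 1.0) -> float:
    """Polynomial kernel: (x . y + coef0) ** degree (assumed parameter values)."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x: Vector, y: Vector, gamma: float = 0.5) -> float:
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2) (assumed gamma)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(vectors: List[Vector], kernel: Callable[[Vector, Vector], float]) -> List[List[float]]:
    """Pairwise kernel values; an SVM trains on this matrix rather than on raw vectors."""
    return [[kernel(u, v) for v in vectors] for u in vectors]


# Hypothetical 3-dimensional feature vectors standing in for per-utterance features.
features = [[1.0, 0.0, 2.0], [0.5, 1.0, 1.5], [-1.0, 0.5, 0.0]]

for name, kernel in (
    ("linear", linear_kernel),
    ("polynomial", polynomial_kernel),
    ("rbf", rbf_kernel),
):
    print(name, gram_matrix(features, kernel))
```

Swapping the kernel changes only how similarity between two utterance vectors is measured; the rest of the SVM training procedure stays the same, which is what makes comparing kernels on the same data straightforward.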

Author information

Authors and Affiliations

Advanced Technologies for Image and Signal Processing (ATISP) Research Unit, Sfax, Tunisia
Rania Chakroun & Mondher Frikha
National School of Electronics and Telecommunications of Sfax, Sakiet Ezzit, Tunisia
Mondher Frikha
National School of Engineering of Sfax, Sfax, Tunisia
Rania Chakroun

Corresponding author

Correspondence to Rania Chakroun.

Editor information

Editors and Affiliations

Department of Digital Innovation, School of Business, University of Nicosia, Nicosia, Cyprus
Marinos Themistocleous
British University in Dubai, Dubai, United Arab Emirates
Maria Papadaki
School of Strategy and Leadership, Coventry University, Coventry, UK
Muhammad Mustafa Kamal

About this paper

Cite this paper

Chakroun, R., Frikha, M. (2020). A New Text Independent Speaker Recognition System with Short Utterances Using SVM. In: Themistocleous, M., Papadaki, M., Kamal, M.M. (eds) Information Systems. EMCIS 2020. Lecture Notes in Business Information Processing, vol 402. Springer, Cham. https://doi.org/10.1007/978-3-030-63396-7_38

DOI: https://doi.org/10.1007/978-3-030-63396-7_38
Published: 21 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63395-0
Online ISBN: 978-3-030-63396-7
eBook Packages: Computer Science (R0)
