A New Text Independent Speaker Recognition System with Short Utterances Using SVM

  • Conference paper
Information Systems (EMCIS 2020)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 402)

Abstract

Recent advances in the field of speaker recognition have produced algorithms that achieve high performance. However, this performance degrades when only limited data are available. This paper presents examples of how Support Vector Machines (SVM) can improve speaker recognition for short-duration utterances. The main contribution of this approach is the use of new vectors when training and testing data are limited. We show how different SVM kernel functions can be used to validate the new approach with different speakers from different databases.
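To make the notion of "different kernel functions" concrete, the following is a minimal sketch of three kernels commonly used with SVMs (linear, polynomial, and RBF). The abstract does not say which kernels the paper evaluates, so this selection is an assumption, as are the function names, the hyperparameter defaults (degree, coef0, gamma), and the toy feature vectors u and v, which stand in for per-utterance speaker features.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, y> + coef0) ** degree (defaults are illustrative)."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2) (gamma is illustrative)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical 3-dimensional feature vectors, not data from the paper.
u = [1.0, 2.0, 0.0]
v = [0.0, 1.0, 1.0]

kernels = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "rbf": rbf_kernel,
}

# Evaluate every kernel on the same pair of vectors for comparison.
values = {name: kernel(u, v) for name, kernel in kernels.items()}
for name, value in values.items():
    print(f"{name}: {value:.4f}")
```

An SVM uses such a function as its similarity measure between feature vectors, so swapping the kernel changes the shape of the decision boundary without changing the training procedure.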

Author information

Corresponding author

Correspondence to Rania Chakroun.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Chakroun, R., Frikha, M. (2020). A New Text Independent Speaker Recognition System with Short Utterances Using SVM. In: Themistocleous, M., Papadaki, M., Kamal, M.M. (eds) Information Systems. EMCIS 2020. Lecture Notes in Business Information Processing, vol 402. Springer, Cham. https://doi.org/10.1007/978-3-030-63396-7_38

  • DOI: https://doi.org/10.1007/978-3-030-63396-7_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63395-0

  • Online ISBN: 978-3-030-63396-7

  • eBook Packages: Computer Science (R0)