On the Impact of Non-speech Sounds on Speaker Recognition

  • Conference paper
Text, Speech and Dialogue (TSD 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7499)

Abstract

This paper investigates the impact of non-speech sounds on speaker recognition performance. Experiments measured the accuracy of speaker classification when non-speech sounds, such as breaths, were removed from the training speech, the testing speech, or both. The experiments used the GMM-UBM algorithm on speech from the TIMIT speech corpus, either in its original form or transcoded using the G.711 or GSM 06.10 codec. The results show that non-speech sounds make a remarkable contribution to overall speaker recognition performance.
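
To make the experimental setup more concrete, here is a minimal sketch of how a GMM-UBM system might score a test utterance with and without frames marked as non-speech. It is not the paper's implementation: the function names, the `keep_mask` parameter, and all numbers are hypothetical, and the toy models are one-dimensional. A real system would typically use multi-dimensional spectral features and a speaker model adapted from a trained UBM.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm), where gmm is a list of (weight, mean, var) components."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def llr_score(frames, speaker_gmm, ubm, keep_mask=None):
    """Average per-frame log-likelihood ratio of the speaker model vs. the UBM.

    keep_mask (hypothetical parameter): one boolean per frame; frames marked
    False, e.g. those labelled as breaths, are left out of the score.
    """
    if keep_mask is None:
        keep_mask = [True] * len(frames)
    used = [f for f, keep in zip(frames, keep_mask) if keep]
    if not used:
        raise ValueError("no frames left after masking")
    total = sum(
        gmm_log_likelihood(f, speaker_gmm) - gmm_log_likelihood(f, ubm)
        for f in used
    )
    return total / len(used)


# Toy 1-D models and frames (made-up numbers, not data from the paper).
ubm = [(0.5, [0.0], [1.0]), (0.5, [4.0], [1.0])]
speaker = [(0.5, [0.5], [1.0]), (0.5, [4.0], [1.0])]
frames = [[0.4], [0.6], [4.1], [3.9]]
is_speech = [True, True, False, False]  # assumed frame labels

score_all = llr_score(frames, speaker, ubm)
score_speech_only = llr_score(frames, speaker, ubm, keep_mask=is_speech)
print(score_all, score_speech_only)
```

The two scores differ only in which frames enter the average, which is the variable the experiments manipulate for training and testing speech. The toy numbers say nothing about which condition performs better in practice.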

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Janicki, A. (2012). On the Impact of Non-speech Sounds on Speaker Recognition. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_69

  • DOI: https://doi.org/10.1007/978-3-642-32790-2_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32789-6

  • Online ISBN: 978-3-642-32790-2

  • eBook Packages: Computer Science, Computer Science (R0)
