On the Impact of Non-speech Sounds on Speaker Recognition

Janicki, Artur

doi:10.1007/978-3-642-32790-2_69

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7499)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper investigates the impact of non-speech sounds on the performance of speaker recognition. Experiments were conducted to measure how the accuracy of speaker classification changes when non-speech sounds, such as breaths, are removed from the training and/or testing speech. The experiments used the GMM-UBM algorithm and speech taken from the TIMIT speech corpus, either original or transcoded using the G.711 or GSM 06.10 codecs. The results show a remarkable contribution of non-speech sounds to the overall speaker recognition performance.
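
As an illustration of the kind of scoring the abstract refers to, the sketch below computes an average log-likelihood ratio between a speaker model and a universal background model (UBM), both taken to be diagonal-covariance Gaussian mixtures, with the option of dropping frames labelled as non-speech. It is a minimal sketch under stated assumptions and not the paper's implementation: the function names, the `keep_mask` parameter, and the toy one-dimensional models are all hypothetical, and training, MAP adaptation, and feature extraction are omitted.

```python
import math


def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(comp_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in comp_logs))


def llr_score(frames, speaker_gmm, ubm, keep_mask=None):
    """Average per-frame log-likelihood ratio: speaker model versus UBM.

    keep_mask is a hypothetical per-frame flag list; False drops a frame,
    for example one labelled as a breath.
    """
    if keep_mask is None:
        keep_mask = [True] * len(frames)
    kept = [f for f, keep in zip(frames, keep_mask) if keep]
    if not kept:
        raise ValueError("no frames left after masking")
    total = sum(
        diag_gmm_loglik(f, *speaker_gmm) - diag_gmm_loglik(f, *ubm)
        for f in kept
    )
    return total / len(kept)


# Toy 1-D models, each given as (weights, means, variances). Assumed values.
ubm = ([0.5, 0.5], [[0.0], [4.0]], [[1.0], [1.0]])
speaker = ([0.5, 0.5], [[1.0], [4.0]], [[1.0], [1.0]])

frames = [[1.0], [1.2], [4.0]]
breath_mask = [True, True, False]  # pretend the last frame is a breath

score_all = llr_score(frames, speaker, ubm)
score_masked = llr_score(frames, speaker, ubm, keep_mask=breath_mask)
```

Scoring the same utterance with and without the mask mirrors, in a very reduced form, the comparison the abstract describes; which condition yields better recognition is an empirical question that this toy example does not answer.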

Author information

Authors and Affiliations

Institute of Telecommunication, Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665, Warsaw, Poland
Artur Janicki

Editor information

Editors and Affiliations

Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Department of Information Technologies, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Aleš Horák, Ivan Kopeček & Karel Pala

About this paper

Cite this paper

Janicki, A. (2012). On the Impact of Non-speech Sounds on Speaker Recognition. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_69

DOI: https://doi.org/10.1007/978-3-642-32790-2_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32789-6
Online ISBN: 978-3-642-32790-2
eBook Packages: Computer Science (R0)
