Abstract
This paper reports results obtained by fusing human-assisted speaker verification methods based on formant features and on statistics of phone durations. Our experiments on a database of spontaneous speech demonstrate that using segmental durational characteristics leads to better performance, which supports the applicability of these features to the speaker verification task.
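As a rough illustration of the two ingredients named above, the following minimal sketch computes per-phone duration statistics from a phone-level alignment and combines a duration-based score with a formant-based score by a weighted sum. Everything here is an assumption made for illustration: the abstract does not specify the similarity measure, the fusion rule, or the weights, and the names `phone_duration_means`, `duration_score`, `fuse_scores`, and `weight` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def phone_duration_means(segments):
    """Return the mean duration in seconds for each phone label.

    `segments` is an iterable of (label, start, end) tuples, such as a
    forced alignment might produce (assumed input format).
    """
    durations = defaultdict(list)
    for label, start, end in segments:
        durations[label].append(end - start)
    return {label: mean(values) for label, values in durations.items()}


def duration_score(enrol, test):
    """Hypothetical similarity: negative mean absolute difference of the
    per-phone mean durations, taken over phones present in both recordings.
    Higher (closer to zero) means more similar."""
    shared = sorted(set(enrol) & set(test))
    if not shared:
        raise ValueError("no phones in common between the two recordings")
    return -mean(abs(enrol[p] - test[p]) for p in shared)


def fuse_scores(formant_score, dur_score, weight=0.5):
    """Weighted-sum score-level fusion (an assumed rule, not the paper's).

    `weight` is the share given to the formant-based score; in practice
    both scores would need calibration to a common scale first.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * formant_score + (1.0 - weight) * dur_score


if __name__ == "__main__":
    enrol = phone_duration_means([("a", 0.00, 0.10), ("a", 0.50, 0.62), ("t", 0.10, 0.15)])
    test = phone_duration_means([("a", 0.00, 0.12), ("t", 0.12, 0.18)])
    # The formant score below is a placeholder value, not a measured result.
    print(fuse_scores(formant_score=0.2, dur_score=duration_score(enrol, test)))
```

A weighted sum is only one of several common fusion choices; it is used here because it makes the contribution of each subsystem explicit through a single parameter.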
Notes
- 1. Superscript t is omitted for the sake of presentation clarity.
Acknowledgments
This work was financially supported by the Government of the Russian Federation, Grant 074-U01.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Bulgakova, E., Sholohov, A., Tomashenko, N., Matveev, Y. (2015). Speaker Verification Using Spectral and Durational Segmental Characteristics. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_49
DOI: https://doi.org/10.1007/978-3-319-23132-7_49
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science, Computer Science (R0)