Abstract
This paper presents a procedure for acquiring linguistic data from broadcast media and its use in language recognition. The goal of this work is to determine whether data obtained automatically from broadcasts can replace or augment continuous telephone speech. The main challenges are channel compensation and the large proportion of non-spontaneous speech in broadcasts. The experimental results are obtained on the NIST LRE 2007 evaluation system, using both NIST-provided training data and data obtained from broadcasts.
This work was partly supported by the European projects AMIDA (FP6-033812), Caretaker (FP6-027231) and MOBIO (FP7-214324), by the Grant Agency of the Czech Republic under project No. 102/08/0707 and by the Czech Ministry of Education under project No. MSM0021630528. The hardware used in this work was partially provided by CESNET under project No. 162/2005. Lukáš Burget was supported by the Grant Agency of the Czech Republic under project No. GP102/06/383.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Plchot, O., Hubeika, V., Burget, L., Schwarz, P., Matějka, P. (2008). Acquisition of Telephone Data from Radio Broadcasts with Applications to Language Recognition. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_61
DOI: https://doi.org/10.1007/978-3-540-87391-4_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87390-7
Online ISBN: 978-3-540-87391-4
eBook Packages: Computer Science, Computer Science (R0)