Abstract
This paper presents experiments in language identification for Spanish and Basque, both official languages in the Basque Country in northern Spain. We focus on four methods based on phone decoding, some of which make use of phonotactic knowledge. We also compare a generic phonotactic model with a task-specific one. Despite poor initial performance, significant accuracy is achieved when better phonotactic knowledge is used. The task-specific phonotactic model performs slightly better, but it is only useful with the less expensive methods. Finally, we present the temporal evolution of the accuracies. Results show that 5-6 seconds of speech are enough to achieve a similar percentage of correctly classified utterances.
This work was partially supported by the CICYT project TIN2005-08660-C04-03 and by the University of the Basque Country under grant 9/UPV 00224.310-15900/2004.
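As a rough illustration of how phonotactic knowledge can be applied to a decoded phone sequence, the sketch below scores the sequence with one phone n-gram model per language and picks the language with the highest log-likelihood. It is not the authors' implementation: the bigram order, the add-one smoothing, the toy phone strings, and the names `BigramPhoneModel` and `identify` are all assumptions made for this example.

```python
import math
from collections import Counter


class BigramPhoneModel:
    """Add-one smoothed bigram model over phone symbols (illustrative only)."""

    def __init__(self, sequences, inventory):
        self.vocab_size = len(set(inventory))
        self.bigrams = Counter()
        self.contexts = Counter()
        for seq in sequences:
            prev = "<s>"  # hypothetical start-of-utterance symbol
            for phone in seq:
                self.bigrams[(prev, phone)] += 1
                self.contexts[prev] += 1
                prev = phone

    def log_prob(self, seq):
        """Log-likelihood of a decoded phone sequence under this model."""
        total = 0.0
        prev = "<s>"
        for phone in seq:
            num = self.bigrams[(prev, phone)] + 1
            den = self.contexts[prev] + self.vocab_size
            total += math.log(num / den)
            prev = phone
        return total


def identify(decoded_phones, models):
    """Return the language whose phonotactic model scores the phones highest."""
    return max(models, key=lambda lang: models[lang].log_prob(decoded_phones))


# Toy training data: invented phone strings, not taken from any real corpus.
spanish_train = [["k", "a", "s", "a"], ["m", "e", "s", "a"], ["p", "e", "r", "o"]]
basque_train = [["e", "tx", "e", "a"], ["a", "tz", "o"], ["e", "tz", "i"]]
inventory = {p for seq in spanish_train + basque_train for p in seq}

models = {
    "spanish": BigramPhoneModel(spanish_train, inventory),
    "basque": BigramPhoneModel(basque_train, inventory),
}

if __name__ == "__main__":
    print(identify(["e", "tx", "e"], models))
    print(identify(["m", "e", "s", "a"], models))
```

In a real system the phone sequence would come from an acoustic phone decoder, and the phonotactic models would be trained on far larger generic or task-specific text.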
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Guijarrubia, V.G., Torres, M.I. (2007). Language Identification Based on Phone Decoding for Basque and Spanish. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72847-4_31
DOI: https://doi.org/10.1007/978-3-540-72847-4_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72846-7
Online ISBN: 978-3-540-72847-4
eBook Packages: Computer Science, Computer Science (R0)