Language Identification Based on Phone Decoding for Basque and Spanish

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4477)

Abstract

This paper presents experiments in language identification for Spanish and Basque, both official languages of the Basque Country in northern Spain. We focus on four methods based on phone decoding, some of which make use of phonotactic knowledge. We also compare a generic phonotactic model with a task-specific one. Despite poor initial performance, significant accuracies are achieved when better phonotactic knowledge is used. The task-specific phonotactic model performs slightly better, but it is only useful with the less expensive methods. Finally, we present the temporal evolution of the accuracies. The results show that 5-6 seconds are enough to achieve a similar percentage of correctly classified utterances.
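The abstract does not describe the four methods in detail, so the following is only a generic sketch of how phonotactic knowledge can be applied to a decoded phone sequence. It is not the authors' implementation. It assumes a phone decoder has already produced a sequence of phone labels, and that each language has a smoothed bigram model over those labels. The function names (`train_bigram`, `score`, `identify`), the add-alpha smoothing, the toy phone strings, and the labels `lang_a` and `lang_b` are all hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab, alpha=1.0):
    """Build an add-alpha smoothed bigram log-probability function
    from phone sequences (illustrative only; not the paper's model)."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the utterance start
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab_size = len(vocab)

    def logprob(prev, cur):
        numerator = pair_counts[(prev, cur)] + alpha
        denominator = context_counts[prev] + alpha * vocab_size
        return math.log(numerator / denominator)

    return logprob


def score(seq, logprob):
    """Log-likelihood of a decoded phone sequence under one model."""
    padded = ["<s>"] + list(seq)
    return sum(logprob(p, c) for p, c in zip(padded, padded[1:]))


def identify(seq, models):
    """Return the language whose phonotactic model scores seq highest."""
    return max(models, key=lambda lang: score(seq, models[lang]))


# Toy, made-up phone strings; they do not reflect real Basque or Spanish.
vocab = {"t", "s", "a", "k"}
models = {
    "lang_a": train_bigram([["t", "s", "a"], ["a", "t", "s"]], vocab),
    "lang_b": train_bigram([["k", "a", "s"], ["s", "a", "k"]], vocab),
}
decision = identify(["t", "s", "a"], models)
```

Applying `identify` to successively longer prefixes of the decoded sequence is one simple way to observe how the decision changes as more of the utterance becomes available.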

This work was partially supported by the CICYT project TIN2005-08660-C04-03 and by the University of the Basque Country under grant 9/UPV 00224.310-15900/2004.

Editor information

Joan Martí, José Miguel Benedí, Ana Maria Mendonça, Joan Serrat

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Guijarrubia, V.G., Torres, M.I. (2007). Language Identification Based on Phone Decoding for Basque and Spanish. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72847-4_31

  • DOI: https://doi.org/10.1007/978-3-540-72847-4_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72846-7

  • Online ISBN: 978-3-540-72847-4

  • eBook Packages: Computer Science, Computer Science (R0)
