Semantic speech recognition in the Basque context Part II: language identification for under-resourced languages

International Journal of Speech Technology

Abstract

This paper describes the development of a Language Identification (LID) system for robust multilingual speech recognition in the Basque context, where three languages coexist: Basque, Spanish and French. The LID system is integrated into GorUP, a Semantic Speech Recognition system for complex industrial environments described in Part I. The work presents hybrid strategies for LID in which system elements are selected by several classifiers (Support Vector Machines and Multilayer Perceptron) and by Discriminant Analysis improved with robust regularized covariance matrix estimation methods suited to under-resourced languages, together with stochastic methods for speech recognition tasks (Hidden Markov Models and n-grams). The LID tool manages the main elements of the Automatic Speech Recognition system: the Acoustic Phonetic Decoder, the Language Model and the Lexicons.
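The abstract does not say which regularized covariance estimator is used, so the following is only a minimal sketch of the general idea, in the spirit of regularized discriminant analysis: a class covariance estimated from few samples is shrunk toward a pooled covariance and then toward a scaled identity, which keeps it invertible when training data are scarce. The function names, the parameters `lam` and `gamma`, and the simplified convex-combination form are assumptions made for illustration, not the authors' method.

```python
def covariance(samples):
    """Sample covariance of a list of equal-length feature vectors (divides by n - 1)."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for row in samples:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j])
    return [[value / (n - 1) for value in cov_row] for cov_row in cov]


def regularized_covariance(class_cov, pooled_cov, lam, gamma):
    """Hypothetical two-step shrinkage (assumed form, not taken from the paper).

    lam   in [0, 1]: 0 keeps the class covariance, 1 uses the pooled covariance.
    gamma in [0, 1]: 0 keeps the mixed matrix, 1 uses a scaled identity.
    """
    d = len(class_cov)
    # Step 1: blend the class covariance with the pooled covariance.
    mixed = [
        [(1 - lam) * class_cov[i][j] + lam * pooled_cov[i][j] for j in range(d)]
        for i in range(d)
    ]
    # Step 2: shrink toward the identity scaled by the average variance.
    avg_var = sum(mixed[i][i] for i in range(d)) / d
    return [
        [(1 - gamma) * mixed[i][j] + (gamma * avg_var if i == j else 0.0)
         for j in range(d)]
        for i in range(d)
    ]


if __name__ == "__main__":
    # Two samples in two dimensions give a singular covariance estimate.
    few_samples = [[1.0, 2.0], [3.0, 6.0]]
    class_cov = covariance(few_samples)
    pooled_cov = [[2.0, 0.0], [0.0, 2.0]]  # illustrative pooled estimate
    print(regularized_covariance(class_cov, pooled_cov, lam=0.5, gamma=0.5))
```

With very few samples per language the raw class covariance is often singular; any `gamma` greater than zero makes the regularized matrix positive definite, so a discriminant classifier can invert it.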

Author information

Corresponding author

Correspondence to Karmele López de Ipiña.

About this article

Cite this article

Barroso, N., López de Ipiña, K., Hernández, C. et al. Semantic speech recognition in the Basque context Part II: language identification for under-resourced languages. Int J Speech Technol 15, 41–47 (2012). https://doi.org/10.1007/s10772-011-9114-4

  • DOI: https://doi.org/10.1007/s10772-011-9114-4