Abstract
In this paper, we present the implementation of an Automatic Speech Recognition (ASR) system for the Southern Quechua language. The software can recognize both continuous speech and isolated words. The ASR system was developed using the Hidden Markov Model Toolkit (HTK) and the corpus collected by Siminchikkunarayku. A dictionary provides the system with a mapping from vocabulary words to phoneme sequences; the audio files were processed to extract speech feature vectors (Mel-frequency cepstral coefficients, MFCC), and the acoustic model was then trained on the MFCC files until convergence. The paper also describes in detail the architecture of an ASR system built with HTK library modules and tools. The system was tested on audio recorded by volunteers and obtained a word error rate of 12.70%.
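For readers unfamiliar with the metric, the word error rate reported above is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring procedure used in the paper; the function name `word_error_rate` and the placeholder tokens in the usage note are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the word-level edit distance between the reference
    # prefix processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)
```

With the placeholder transcripts `"a b c d"` (reference) and `"a x c"` (hypothesis), one substitution and one deletion are needed, so the function returns 0.5. A corpus-level figure is usually obtained by summing the errors over all test utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.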
This project was supported by CONCYTEC CIENCIACTIVA of the Peruvian government through grant 164-2015-FONDECYT and by PUCP through grant 2017-3-0039/436.
Notes
- 1. 2017 National Census, https://www.inei.gob.pe/.
- 2.
- 3.
- 4.
- 5. Ministerial Resolution 1218-1985-ED.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zevallos, R., Cordova, J., Camacho, L. (2020). Automatic Speech Recognition of Quechua Language Using HMM Toolkit. In: Lossio-Ventura, J.A., Condori-Fernandez, N., Valverde-Rebaza, J.C. (eds) Information Management and Big Data. SIMBig 2019. Communications in Computer and Information Science, vol 1070. Springer, Cham. https://doi.org/10.1007/978-3-030-46140-9_6
DOI: https://doi.org/10.1007/978-3-030-46140-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46139-3
Online ISBN: 978-3-030-46140-9
eBook Packages: Computer Science, Computer Science (R0)