Abstract
In this paper, we present our automatic speech recognition system for Amazigh. The system is built on context-independent phonetic Hidden Markov Models (HMMs). Several design choices shape the system, including the number of states per model, the type of emission probability density associated with each state, and the representation of the signal by cepstral coefficients. The recognition results place the system at a high level of performance, comparable to that achieved by other Markovian automatic speech recognition systems. The system is designed to recognize 43 distinct isolated Amazigh words (33 letters and 10 digits), and the recognition rate is calculated for each digit and letter. After extensive testing and tuning of the recognition parameters, the overall accuracy and word recognition rate over the whole database reached 91.31%. These results improve on our previous work on automatic recognition of spoken Amazigh digits and letters using the Hidden Markov Model Toolkit (HTK).
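The per-word and overall recognition rates mentioned above can be illustrated with a minimal sketch. It assumes that the rate for a word is the percentage of its test utterances recognized correctly, and that the overall rate is the percentage of all test utterances recognized correctly; the abstract does not spell out the exact formula. The function name `recognition_rates`, the labels `word_a`, `word_b`, `word_c`, and the toy results are hypothetical placeholders, not data from the paper.

```python
from collections import Counter


def recognition_rates(pairs):
    """Compute per-word and overall recognition rates in percent.

    `pairs` is an iterable of (reference_word, recognized_word) tuples,
    one per test utterance. Assumption: rate = correct / total * 100.
    """
    total = Counter()    # utterances seen per reference word
    correct = Counter()  # correctly recognized utterances per reference word
    for reference, recognized in pairs:
        total[reference] += 1
        if reference == recognized:
            correct[reference] += 1

    if not total:
        raise ValueError("no test utterances supplied")

    per_word = {word: 100.0 * correct[word] / total[word] for word in total}
    overall = 100.0 * sum(correct.values()) / sum(total.values())
    return per_word, overall


# Hypothetical toy results: five utterances over three placeholder words.
toy_results = [
    ("word_a", "word_a"),
    ("word_a", "word_b"),  # one misrecognition
    ("word_b", "word_b"),
    ("word_b", "word_b"),
    ("word_c", "word_c"),
]
per_word, overall = recognition_rates(toy_results)
for word, rate in sorted(per_word.items()):
    print(f"{word}: {rate:.2f}%")
print(f"overall: {overall:.2f}%")
```

For isolated-word recognition each utterance yields exactly one hypothesis, so counting exact matches is sufficient; a continuous-speech setting would also need to account for insertions and deletions.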
Cite this article
El Ouahabi, S., Atounti, M. & Bellouki, M. Optimal parameters selected for automatic recognition of spoken Amazigh digits and letters using Hidden Markov Model Toolkit. Int J Speech Technol 23, 861–871 (2020). https://doi.org/10.1007/s10772-020-09762-3
DOI: https://doi.org/10.1007/s10772-020-09762-3