Lexical modeling for the development of Amharic automatic speech recognition systems

Tachbelie, Martha Yifiru; Abate, Solomon Teferra

doi:10.1007/s10579-023-09659-y

Original Paper
Published: 03 May 2023

Volume 57, pages 963–984, (2023)

Language Resources and Evaluation

Abstract

Amharic is the second most widely spoken Semitic language after Arabic. It has its own syllabic writing system, in which each character represents a consonant and a vowel. Automatic Speech Recognition (ASR) research for Amharic has so far relied on grapheme-based pronunciation lexicons, taking advantage of the nature of this writing system. However, the epenthetic vowel and the glottal stop consonant represented in the writing system may not be pronounced in all of their occurrences. Moreover, the writing system does not distinguish geminated from non-geminated consonants. The grapheme-based pronunciation lexicon used so far therefore has limitations with regard to these language features. To address these limitations, we have prepared word- and morpheme-based pronunciation lexicons using data-driven transcription and knowledge-driven transcription by experts. The data-driven transcription has been used to prepare the training pronunciation lexicon, while the knowledge-driven transcription has been used to prepare morpheme- and word-based pronunciation lexicons for decoding. With the morpheme-based knowledge-driven lexicons, better ASR performance has been achieved than with the baseline ASR system that used a grapheme-based lexicon, even though they use considerably more phones (60) than the grapheme-based lexicon (37).
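To make the idea of a grapheme-based lexicon concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard Unicode layout of the Ethiopic block (eight code points per consonant row), covers only three consonants, and uses hypothetical ASCII phone labels, with "I" standing for the sixth-order (epenthetic) vowel. The helper drop_final_epenthetic is a hypothetical illustration of one way a pronunciation can differ from the spelling; it is not the data-driven or knowledge-driven transcription described in the abstract.

```python
# Illustrative sketch only: a toy grapheme-based pronunciation lexicon.
# The phone labels and the three-consonant table are hypothetical and are
# not the phone set used in the paper.

ETHIOPIC_BASE = 0x1200  # first code point of the Unicode Ethiopic block

# First-order code point of a consonant row -> consonant label (tiny subset).
CONSONANT_ROWS = {0x1208: "l", 0x1218: "m", 0x1230: "s"}

# One vowel label per order; "I" marks the sixth-order (epenthetic) vowel.
ORDER_VOWELS = ["e", "u", "i", "a", "E", "I", "o"]


def grapheme_pronunciation(word):
    """Map every character to a consonant phone followed by a vowel phone."""
    phones = []
    for ch in word:
        order = (ord(ch) - ETHIOPIC_BASE) % 8  # position within the row
        row = ord(ch) - order                  # first-order code point
        if row not in CONSONANT_ROWS or order >= len(ORDER_VOWELS):
            raise ValueError(f"character {ch!r} is outside this toy table")
        phones.append(CONSONANT_ROWS[row])
        phones.append(ORDER_VOWELS[order])
    return phones


def drop_final_epenthetic(phones):
    """Hypothetical variant: omit a word-final epenthetic vowel."""
    if phones and phones[-1] == "I":
        return phones[:-1]
    return list(phones)


# U+1230 U+120B U+121D spells the word usually romanized as "selam".
word = "\u1230\u120b\u121d"
grapheme_entry = grapheme_pronunciation(word)
print(grapheme_entry)                         # ['s', 'e', 'l', 'a', 'm', 'I']
print(drop_final_epenthetic(grapheme_entry))  # ['s', 'e', 'l', 'a', 'm']
```

The grapheme-based entry always ends in the epenthetic vowel because the final character is a sixth-order form, whereas the word is normally pronounced without it. Gemination cannot be recovered this way at all, since the spelling carries no mark for it.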

Notes

The number of phones used in the IARPA Babel Amharic lexicon is 61. However, this is not because consonant gemination is represented, as in our lexicon, but because labiovelars are represented differently. We have represented labialization as variations of only 5 vowels, whereas the IARPA Babel Amharic lexicon represents it as variations of 26 consonants.

References

Abate, S.T. (2006). Automatic speech recognition for Amharic. PhD thesis, University of Hamburg, Hamburg
Abate, S.T., Menzel, W. (2007a). Automatic speech recognition for an under-resourced language—Amharic. In: Proceeding of INTERSPEECH, pp. 1541–1544
Abate, S.T., Menzel, W. (2007b). Syllable-based speech recognition for Amharic. In: Proceeding of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources, pp. 33–40
Abate, S.T., Menzel, W., Tafila, B. (2005). An Amharic speech corpus for large vocabulary continuous speech recognition. In: Proceeding of INTERSPEECH, pp. 1601–1604
Abate, S.T., Tachbelie, M.Y., Melese, M., et al. (2020a). Large vocabulary read speech corpora for four Ethiopian languages: Amharic, Tigrigna, Oromo and Wolaytta. In: LREC 2020
Abate, S.T., Tachbelie, M.Y., Schultz, T. (2020b). Deep neural networks based automatic speech recognition for four Ethiopian languages. In: ICASSP 2020
Abate, S.T., Tachbelie, M.Y., Schultz, T. (2020c). Multilingual acoustic and language modeling for Ethio-Semitic languages. In: Meng, H., Xu, B., Zheng, T.F. (eds) Interspeech. ISCA, pp. 1047–1051
Abate, S.T., Tachbelie, M.Y., Schultz, T. (2021). End-to-end multilingual automatic speech recognition for less-resourced languages: The case of four Ethiopian languages. In: ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 7013–7017, https://doi.org/10.1109/ICASSP39728.2021.9415020
Appen Butler Hill Pty Ltd (2012). Speech and language resources 2012. Appen Butler Hill Speech and Language Resources 2012-Product Catalogue
Appleyard, D. (1995). Colloquial Amharic: A complete course for beginners. London: Routledge.
Bender, M. L., Bowen, J. D., Cooper, R. L., et al. (1976). Languages in Ethiopia. London: Oxford University Press.
Berhanu, S. (2001). Isolated Amharic consonant-vowel syllable recognition: An experiment using the hidden Markov model. Master’s thesis, School of Information Studies for Africa, Addis Ababa University, Addis Ababa Ethiopia
Besacier, L., Le, V.B., Boitet, C., et al. (2006). ASR and translation for under-resourced languages. In: Proceeding of the IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP 2006), pp. 1221–1224
Bills, A., Conners, T., David, A., et al. (2019). IARPA Babel Amharic Language Pack IARPA-babel307b-v1.0b. https://doi.org/11272.1/AB2/U1H3H7, https://hdl.handle.net/11272.1/AB2/U1H3H7
Choueiter, G., Povey, D., Chen, S.F., et al. (2006). Morpheme-based language modeling for Arabic LVCSR. In: Proceeding of ICASSP 2006
Dribssa, A.E., Tachbelie, M.Y. (2015). Investigating the use of syllable acoustic units for Amharic speech recognition. In: Proceeding of the IEEE AFRICON
Gelas, H., Abate, S.T., Besacier, L., et al. (2011). Quality assessment of crowdsourcing transcriptions for African languages. In: Proceeding of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH 2011), pp. 3065–3068
Girmaw, M. (2004). An automatic speech recognition system for Amharic. Master’s thesis, Department of Signal, Sensor and System, Royal Institute of Technology, Stockholm Sweden
H/Mariam, S., Prahallad, K., Black, A.W., et al. (2004). Unit selection voice for Amharic using Festvox. In: Proceeding of the 5th ISCA Speech Synthesis Workshop
Hou, W., Dong, Y., Zhuang, B., et al. (2020). Large-scale end-to-end multilingual speech recognition and language identification with multi-task learning. In: Interspeech
Karafiát, M., Baskar, M.K., Matějka, P., et al. (2016). Multilingual BLSTM and speaker-specific vector adaptation in 2016 BUT Babel system. In: 2016 IEEE Spoken Language Technology Workshop (SLT), pp. 637–643, https://doi.org/10.1109/SLT.2016.7846330
Leslau, W. (1976). Concise Amharic dictionary. Wiesbaden: Otto Harrassowitz.
Leslau, W. (2000). Introductory grammar of Amharic. Wiesbaden: Harrassowitz Verlag.
Li, X., Dalmia, S., Black, A.W., et al. (2019). Multilingual speech recognition with corpus relatedness sampling. arXiv:1908.01060
Pellegrini, T., Lamel, L. (2006). Investigating automatic decomposition for ASR in less represented languages. In: Proceeding of INTERSPEECH
Pellegrini, T., & Lamel, L. (2009). Automatic word decompounding for ASR in a morphologically rich language: Application to Amharic. IEEE Transactions on Audio Speech and Language Processing, 17(5), 863–873.
Seid, H., Gambaeck, B. (2005). A speaker independent continuous speech recognizer for Amharic. In: Proceeding of INTERSPEECH, pp. 3349–3352
Seifu, Z. (2003). HMM based large vocabulary, speaker independent, continuous Amharic speech recognizer. Master’s thesis, School of Information Studies for Africa, Addis Ababa University, Addis Ababa Ethiopia
Stolcke, A. (2002). SRILM-an extensible language modeling toolkit. In: Proceeding of International Conference on Spoken Language Processing, pp. 257–286
Tachbelie, M.Y. (2010). Morphology based language modeling for Amharic. PhD thesis, University of Hamburg, Hamburg Germany
Tachbelie, M.Y., Abate, S.T. (2015). Effect of language resources on automatic speech recognition for Amharic. In: Proceeding of IEEE AFRICON
Tachbelie, M.Y., Abate, S.T., Menzel, W. (2009). Automatic speech recognition for an under-resourced language-Amharic. In: Proceeding of the 4th Language and Technology Conference (LTC-09), pp. 114–118
Tachbelie, M.Y., Abate, S.T., Menzel, W. (2010). Morpheme-based automatic speech recognition for a morphologically rich language-Amharic. In: Proceeding of Spoken Language Technology for Under-resourced Languages (SLTU 10), pp. 68–73
Tachbelie, M.Y., Abate, S.T., Besacier, L. (2011a). Part-of-speech tagging for under-resourced and morphologically rich languages-the case of Amharic. In: Proceeding of Conference on Human Language Technology for Development, Alexandria Egypt
Tachbelie, M.Y., Abate, S.T., Menzel, W. (2011b). Morpheme-based and factored language modeling for Amharic speech recognition. In: Human Language Technology: Challenges for Computer Science and Linguists, pp. 82–93
Tachbelie, M.Y., Besacier, L., Rossato, S. (2011c). Comparison of syllable and triphone based speech recognition for Amharic. In: Proceeding of Language Technology Conference (LTC 11), Poznan Poland
Tachbelie, M.Y., Besacier, L., Rossato, S. (2012). Syllable-based and hybrid acoustic models for Amharic speech recognition. In: Proceeding of Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 12), pp. 5–10
Tachbelie, M. Y., Abate, S. T., & Besacier, L. (2014). Using different acoustic lexical and language modeling units for ASR of an under-resourced language-Amharic. Speech Communication, 56, 181–194.
Tachbelie, M. Y., Abulimiti, A., Abate, S. T., et al. (2020). DNN-based speech recognition for GlobalPhone languages. In: ICASSP 2020
Tadesse, K. (2002). Word based Amharic speech recognizer: An experiment using hidden Markov model (HMM). Master’s thesis, School of Information Studies for Africa, Addis Ababa University, Addis Ababa Ethiopia
Yifiru, M. (2003). Automatic Amharic speech recognition system to command and control computers. Master’s thesis, School of Information Studies for Africa, Addis Ababa University, Addis Ababa Ethiopia
Yimam, B. (2007). yamarňa sewasew (2nd ed.). Addis Ababa: EMPDE.
Żelasko, P., Moro-Velázquez, L., Hasegawa-Johnson, M.A., et al. (2020). That sounds familiar: An analysis of phonetic representations transfer across languages. In: INTERSPEECH

Acknowledgements

We are thankful to Google for the Faculty Research Award that enabled us to conduct this research. The results of the preliminary experiments, which used only 25% of the linguistically transcribed data, have been published in the proceedings of the AFRICON conference (Tachbelie and Abate, 2015).

Author information

Martha Yifiru Tachbelie and Solomon Teferra Abate have contributed equally to this work.

Authors and Affiliations

School of Information Science, Addis Ababa University, Addis Ababa, Ethiopia
Martha Yifiru Tachbelie & Solomon Teferra Abate

Authors

Martha Yifiru Tachbelie
Solomon Teferra Abate

Corresponding author

Correspondence to Solomon Teferra Abate.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tachbelie, M.Y., Abate, S.T. Lexical modeling for the development of Amharic automatic speech recognition systems. Lang Resources & Evaluation 57, 963–984 (2023). https://doi.org/10.1007/s10579-023-09659-y

Accepted: 31 March 2023
Published: 03 May 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s10579-023-09659-y
