Abstract
This paper presents the methods and results for building a Festival-compatible pronunciation dictionary of more than 10k words for the Kabyle language. Kabyle is a Berber dialect spoken in northern Algeria. The dictionary is intended for use in text-to-speech and automatic speech recognition systems for Kabyle. It was built with a bootstrapping method in which rules for predicting word pronunciations are built incrementally while wrong predictions are corrected. The outcome is a large pronunciation dictionary together with a set of rules for predicting the pronunciation of unknown words. The rules are embedded in Classification and Regression Trees and achieve a correct prediction rate of 91.62% for entire words and 97.85% for phonemes.
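The bootstrapping loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. As a stand-in for Classification and Regression Trees it uses a simple frequency table over letter contexts with a back-off to the single letter, and it assumes exactly one phoneme per letter. The word list `GOLD`, the `verify` callback (which plays the role of the human corrector) and all function names are hypothetical, and the toy entries are not real Kabyle data.

```python
from collections import Counter, defaultdict

def train(lexicon):
    """Count phonemes per letter context from (word -> phoneme list) pairs.

    Assumes one phoneme per letter, which is a simplification.
    """
    full, back = defaultdict(Counter), defaultdict(Counter)
    for word, phones in lexicon.items():
        padded = "#" + word + "#"
        for i, ph in enumerate(phones):
            full[padded[i:i + 3]][ph] += 1  # previous letter, letter, next letter
            back[word[i]][ph] += 1          # back-off: the letter alone
    return full, back

def predict(model, word):
    """Predict one phoneme per letter, preferring the widest known context."""
    full, back = model
    padded = "#" + word + "#"
    out = []
    for i, letter in enumerate(word):
        ctx = padded[i:i + 3]
        if ctx in full:
            out.append(full[ctx].most_common(1)[0][0])
        elif letter in back:
            out.append(back[letter].most_common(1)[0][0])
        else:
            out.append(letter)  # unseen letter: use the letter itself
    return out

def bootstrap(words, verify, batch_size=2):
    """Predict each batch, let `verify` correct it, then retrain on the result."""
    lexicon, history = {}, []
    for start in range(0, len(words), batch_size):
        model = train(lexicon)  # rules learned from everything verified so far
        batch = words[start:start + batch_size]
        correct = 0
        for w in batch:
            guess = predict(model, w)
            truth = verify(w, guess)  # corrector accepts or fixes the guess
            correct += guess == truth
            lexicon[w] = truth
        history.append(correct / len(batch))  # word accuracy for this batch
    return lexicon, train(lexicon), history

# Hypothetical toy data (not Kabyle): "c" sounds like k before a, like s before i.
GOLD = {
    "ba": ["b", "a"], "ca": ["k", "a"],
    "ci": ["s", "i"], "bi": ["b", "i"],
    "cab": ["k", "a", "b"], "cib": ["s", "i", "b"],
}

def verify(word, guess):
    """Simulated human corrector: always returns the reference pronunciation."""
    return GOLD[word]

lexicon, model, history = bootstrap(list(GOLD), verify)
print(history)
```

Each pass yields both outputs named in the abstract: a growing verified lexicon and a rule set that can be applied to words outside it. The per-batch accuracy in `history` shows how much manual correction each batch needed.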
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lyes, D., Leila, F., Hocine, T. (2019). Building a Pronunciation Dictionary for the Kabyle Language. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_32
DOI: https://doi.org/10.1007/978-3-030-26061-3_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26060-6
Online ISBN: 978-3-030-26061-3
eBook Packages: Computer Science (R0)