Abstract
The development of Human Language Technology (HLT) tools inevitably requires language resources. However, only a handful of languages possess such resources. This paper presents the development of HLT tools for the African language Naija (Nigerian Pidgin), spoken in Nigeria. In particular, it focuses on developing language resources for a tokenizer and for an automatic speech system that predicts the pronunciation of words and performs their segmentation.
The newly created resources are integrated into the SPPAS software tool and distributed under the terms of public licenses.
Notes
- 1. Source: http://www.internetworldstats.com/af/ng.htm - 2017-10.
- 2. Source: https://www.ethnologue.com/country/NG - 2017-06.
- 3.
Acknowledgements
This work was financed by the French “Agence Nationale de la Recherche” (ANR-16-CE27-0007).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Bigi, B., Abiola, O.S., Caron, B. (2020). Resources and Tools for Automated Speech Segmentation of the African Language Naija (Nigerian Pidgin). In: Vetulani, Z., Paroubek, P., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2017. Lecture Notes in Computer Science, vol 12598. Springer, Cham. https://doi.org/10.1007/978-3-030-66527-2_12
DOI: https://doi.org/10.1007/978-3-030-66527-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66526-5
Online ISBN: 978-3-030-66527-2
eBook Packages: Computer Science (R0)