Resources and Tools for Automated Speech Segmentation of the African Language Naija (Nigerian Pidgin)

  • Conference paper
Human Language Technology. Challenges for Computer Science and Linguistics (LTC 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12598)

Abstract

Developing HLT tools inevitably requires language resources, yet only a handful of languages possess them. This paper presents the development of HLT tools for the African language Naija (Nigerian Pidgin), spoken in Nigeria. In particular, it focuses on developing language resources for a tokenizer and for an automatic speech system that predicts the pronunciation of words and their segmentation.

The newly created resources are integrated into the SPPAS software tool and distributed under the terms of public licenses.

Notes

  1. Source: http://www.internetworldstats.com/af/ng.htm - 2017-10.

  2. Source: https://www.ethnologue.com/country/NG - 2017-06.

  3. http://naijasyncor.huma-num.fr/.

Acknowledgements

This work was financed by the French “Agence Nationale de la Recherche” (ANR-16-CE27-0007).

Author information

Corresponding author

Correspondence to Brigitte Bigi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Bigi, B., Abiola, O.S., Caron, B. (2020). Resources and Tools for Automated Speech Segmentation of the African Language Naija (Nigerian Pidgin). In: Vetulani, Z., Paroubek, P., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2017. Lecture Notes in Computer Science, vol 12598. Springer, Cham. https://doi.org/10.1007/978-3-030-66527-2_12

  • DOI: https://doi.org/10.1007/978-3-030-66527-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66526-5

  • Online ISBN: 978-3-030-66527-2

  • eBook Packages: Computer Science, Computer Science (R0)
