Towards an unrestricted domain TTS system for African tone languages

International Journal of Speech Technology

Abstract

In this paper we discuss the procedural problems, issues and challenges involved in developing a generic speech synthesizer for African tone languages. We base our development methodology on the “MultiSyn” unit-selection approach, supported by the Festival Text-To-Speech (TTS) toolkit, and apply it to Ibibio, a language of the Lower Cross subgroup of the (New) Benue-Congo language family that is widely spoken in the southeastern region of Nigeria. We present, in chronological order, the infrastructural and linguistic problems and challenges identified in the Local Language Speech Technology Initiative (LLSTI) during the development process, from the corpus preparation and refinement stage to the integration and synthesis stage. We provide solutions to most of these challenges and point to directions for further refinement. The evaluation of the initial prototype shows that the synthesis system will be useful to non-literate communities and in a wide spectrum of applications.

Author information

Corresponding author

Correspondence to Moses E. Ekpenyong.

About this article

Cite this article

Ekpenyong, M.E., Urua, EA. & Gibbon, D. Towards an unrestricted domain TTS system for African tone languages. Int J Speech Technol 11, 87–96 (2008). https://doi.org/10.1007/s10772-009-9037-5

  • DOI: https://doi.org/10.1007/s10772-009-9037-5