
Profivox—A Hungarian Text-to-Speech System for Telecommunications Applications

International Journal of Speech Technology

Abstract

This paper describes the latest Hungarian text-to-speech (TTS) system developed for telephone-based applications. Its main features are an intelligible, human-like voice; robust software designed for continuous operation; fully automatic conversion of declarative sentences (both short and very long) and questions; and real-time parallel operation on at least 30 channels. The concepts behind prosody generation and sound duration processing are introduced, and the Profivox development environment is presented. Hungary's market-leading mobile service provider uses the TTS system in an automatic e-mail reading application.




Cite this article

Olaszy, G., Németh, G., Olaszi, P. et al. Profivox—A Hungarian Text-to-Speech System for Telecommunications Applications. International Journal of Speech Technology 3, 201–215 (2000). https://doi.org/10.1023/A:1026558915015


  • DOI: https://doi.org/10.1023/A:1026558915015
