Profivox—A Hungarian Text-to-Speech System for Telecommunications Applications

Olaszy, G.; Németh, G.; Olaszi, P.; Kiss, G.; Zainkó, Cs.; Gordos, G.

doi:10.1023/A:1026558915015

Published: December 2000

International Journal of Speech Technology, Volume 3, pages 201–215 (2000)

Abstract

The latest Hungarian text-to-speech (TTS) system, developed for telephone-based applications, is described. Its main features are an intelligible, human-like voice; robust software designed for continuous operation; fully automatic conversion of declarative sentences (both short and very long) and questions; and real-time parallel operation on at least 30 channels. The concepts of prosody generation and sound duration processing are introduced, and the Profivox development environment is presented. The market-leading Hungarian mobile service provider uses the TTS system in an automatic e-mail reading application.

Author information

Authors and Affiliations

Department of Telecommunications and Telematics, Budapest University of Technology and Economics, H-1117, Budapest, Pázmány Péter sétány 1/d, Hungary
G. Olaszy, G. Németh, P. Olaszi, G. Kiss, Cs. Zainkó & G. Gordos

About this article

Cite this article

Olaszy, G., Németh, G., Olaszi, P. et al. Profivox—A Hungarian Text-to-Speech System for Telecommunications Applications. International Journal of Speech Technology 3, 201–215 (2000). https://doi.org/10.1023/A:1026558915015

Issue Date: December 2000
DOI: https://doi.org/10.1023/A:1026558915015
