Software text-to-speech

Hallahan, William I.; Vitale, Anthony J.

doi:10.1007/BF02277193

William I. Hallahan¹ & Anthony J. Vitale²

Abstract

This paper describes the port to software of a mature text-to-speech (TTS) synthesis technology that has been sold as a series of hardware products for over ten years. Originally developed as an alternative to a character cell terminal and for telephony applications, today it is also used to provide people with visual disabilities access to information. The synthesized speech is of extremely high quality in both intelligibility and naturalness, and it is produced by a digital formant synthesizer that simulates the human vocal tract. Before the advent of very high speed processors, the computational demands of this synthesizer placed an extreme load on a workstation. This study used a Digital Equipment AlphaModel 600 workstation to simultaneously convert many text streams to speech. The power of modern RISC processors allows applications to freely use speech for output. This capability has prompted the need for a text-to-speech application programming interface (API). The API that we have developed for TTS software is supported on multiple platforms and multiple operating systems. This paper presents a description of the TTS software architecture. The API is also specified. Finally, our experience in porting the TTS code base from the previous hardware platforms is described.

Author information

Authors and Affiliations

Digital Equipment Corporation, 110 Spitbrook Rd. (ZKO1-1/E37), Nashua, New Hampshire
William I. Hallahan
Digital Equipment Corporation, 200 Forest St. (MRO1-1/L31), Marlborough, MA 01752-3011
Anthony J. Vitale

About this article

Cite this article

Hallahan, W.I., Vitale, A.J. Software text-to-speech. Int J Speech Technol 1, 121–134 (1997). https://doi.org/10.1007/BF02277193

Received: 01 September 1995
Accepted: 18 September 1996
Issue Date: March 1997
DOI: https://doi.org/10.1007/BF02277193