Abstract
This paper describes an experimental real-time recognizer for isolated-word dictation, implemented at the IBM Thomas J. Watson Research Center on a system of commercially available computers and array processors. The recognizer is intended for the creation of office memoranda and is based on a 5000-word vocabulary. A specially designed workstation enables the user to correct and edit the transcribed speech.
© 1986 Springer-Verlag Berlin Heidelberg
Jelinek, F. (1986). The Development of an Experimental Discrete Dictation Recognizer. In: Hommel, G., Schindler, S. (eds) Informatik-Anwendungen — Trends und Perspektiven. Informatik-Fachberichte, vol 126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-71388-0_9
DOI: https://doi.org/10.1007/978-3-642-71388-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-16813-3
Online ISBN: 978-3-642-71388-0
eBook Packages: Springer Book Archive