Letter constraints within words in printed English

Kybernetik

Summary

Due to the manner in which the English language is used, words exhibit strong internal constraints on letters, but some additional constraint may be imposed by the context in which words appear. In order to estimate the internal constraints of words and the overall effect of context, an experiment was carried out using 225 human subjects who predicted letters in each of the first four positions within words, both with and without context prior to the words. It was found that as more letters at the beginning of words are given, prediction of the following letters increases monotonically, but the increase is not smooth. Prediction of the third letter of words given the first two letters is only a little better than prediction of the second letter given only the first. This effect may be explained by the probable combinations of vowels and consonants at the beginning of words. Letters in the first two positions show no improvement due to long context but prediction of later letters is increased by such context so that prediction rises smoothly from the initial letter to the fourth letter. Also, the type of word in which the letters are to be predicted affects the prediction, function words showing more constraint on letters than content words. The difference between function and content words does not take effect, however, until the first two letters of the word are given. Using the prediction data from words preceded by long context, extrapolations of constraint out to the tenth letter were obtained. From the values of constraint at the first ten letter positions it was possible to estimate the maximum unilateral sequential constraint in English. A value of about 48% was obtained which compares with previous estimates of 50%. A further evaluation of the overall effect of context indicates that about 81% of the constraint in English is contained within the words themselves, and the other 19% is due to any additional context.
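As a rough illustration of what a percentage "constraint" can mean, the sketch below uses the common information-theoretic convention of relative redundancy, 1 - H / H_max, computed over a distribution of letters at one position. This is an assumption made for illustration: the summary does not state the exact formula or alphabet size used in the paper, and the function names and the tally of letters are hypothetical, not data from the experiment.

```python
import math


def entropy_bits(counts):
    """Shannon entropy (in bits) of the distribution implied by a dict of counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def relative_constraint(counts, alphabet_size=26):
    """Relative constraint 1 - H / H_max, with H_max = log2(alphabet_size).

    alphabet_size=26 is an assumption; some treatments use 27 to include the space.
    """
    return 1.0 - entropy_bits(counts) / math.log2(alphabet_size)


# Hypothetical tally of second letters following an initial "t" (invented numbers).
second_letter_after_t = {"h": 60, "o": 15, "r": 10, "a": 8, "i": 7}
print(round(relative_constraint(second_letter_after_t), 3))
```

Under this convention a uniform distribution over all letters gives a constraint of 0, and a position where only one letter ever occurs gives a constraint of 1, so values such as the 48% quoted above fall between those two extremes.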

Additional information

This paper is based on a dissertation submitted to the Department of Psychology, The Johns Hopkins University, in partial fulfillment of the requirements for the Ph. D. degree. The research was done under Contract Nonr-248(55) between the Office of Naval Research and The Johns Hopkins University. This is Report No. 13 under that contract. Reproduction in whole or in part is permitted for any purpose of the United States Government.

During the period of this investigation the author was a National Institutes of Health Fellow. The author wishes to thank Wendell R. Garner for his encouragement and advice.

About this article

Cite this article

Carson, D.H. Letter constraints within words in printed English. Kybernetik 1, 46–54 (1961). https://doi.org/10.1007/BF00293854

  • DOI: https://doi.org/10.1007/BF00293854
