Abstract
A key word of a sub-corpus is a word whose frequency in that sub-corpus is significantly higher than expected under the hypothesis that its use is independent of the variable “part of the corpus”. A study in literary statistics almost invariably includes a chapter devoted to key words. However, a strong attack has recently been launched on the way stylometry has modelled texts since the classical works of Herdan, Guiraud or Muller. In fact, statistical modelling seems as valid in stylistics as in any other field of the humanities and social sciences. What is questionable is that many studies in literary statistics are more satisfied with the easy identification of monsters, i.e. literary phenomena left unexplained by wrong models, than with the laborious search for models that fit the textual data well. A short examination of this controversy and the quantitative analysis of an example taken from Laclos' novel Les Liaisons dangereuses support this argument.
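The definition of a key word given above can be made concrete with a minimal sketch. It assumes the classical urn (hypergeometric) model, in which the tokens of the sub-corpus are treated as a random draw without replacement from the whole corpus. This is only one of the simple models whose adequacy the article questions, not the author's own method. The function names, the 5% threshold and the counts are hypothetical.

```python
from math import comb


def upper_tail_hypergeom(k, n, K, N):
    """P(X >= k) under the urn model (an assumption, not the article's model).

    N: tokens in the whole corpus
    K: occurrences of the word in the whole corpus
    n: tokens in the sub-corpus
    k: occurrences of the word in the sub-corpus
    """
    total = comb(N, n)
    # math.comb returns 0 for impossible draws, so no extra bounds check is needed.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    )
    return favourable / total


def is_key_word(k, n, K, N, alpha=0.05):
    """True if the word is over-represented in the sub-corpus at level alpha."""
    expected = n * K / N  # frequency expected under independence
    return k > expected and upper_tail_hypergeom(k, n, K, N) < alpha


# Hypothetical counts: a word seen 50 times in a 10,000-token corpus,
# 20 of them inside a 2,000-token sub-corpus (10 expected under independence).
print(is_key_word(20, 2000, 50, 10000))
```

A word flagged in this way is "key" only relative to the urn model. If that model fits the text poorly, the flag may say more about the model than about the text, which is the caution the abstract raises.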
References
Brunet, Étienne. Le vocabulaire français de 1789 à nos jours d'après les données du Trésor de la Langue Française. 3 vol. Genève-Paris: Slatkine-Champion, 1981.
Brunet, Étienne. “L'hydre de l'urne ou Réponse à un acte d'accusation.” Cahiers de lexicologie, XLIII (1983), 3–31. Also published as “Le viol de l'urne.” In La recherche française par ordinateur en langue et littérature. Éd. Colette Charpentier et Jean David. Genève-Paris: Slatkine-Champion, 1985, pp. 253–64 (+ discussion, pp. 270–71).
Damerau, Fred J. “The Use of Function Word Frequencies as Indicators of Style.” Computers and the Humanities, 3 (1975), 271–80.
Delcourt, Christian. “La statistique littéraire.” In Méthodes du texte. Éd. Maurice Delcroix et Fernand Hallyn. Paris-Gembloux: Duculot, 1987, pp. 132–47 and 365–66.
Dixon, W. J. et al. BMDP Statistical Software. Revised Printing. Berkeley-Los Angeles-London: University of California Press, 1983.
Geffroy, Annie et Pierre Lafon. “L'insécurité dans les grands ensembles. Aperçu critique sur Le vocabulaire français de 1789 à nos jours d'Étienne Brunet.” Mots, 5 (1982), 129–41.
Guilbaud, George Th. “Fréquences et probabilités.” In L'analisi delle frequenze. Problemi di lessicologia. Ed. M. Fattori e M. Bianchi. Roma: Edizioni dell'Ateneo, 1982, pp. 39–61. Republished in Informatique et Sciences Humaines, 14 (1984), 49–67.
Kemp, Kenneth W. “Personal Observations on the Use of Statistical Methods in Quantitative Linguistics.” In The Computer in Literary and Linguistic Studies. Eds. Alan Jones and R. F. Churchhouse. Cardiff: The University of Wales Press, 1976a, pp. 59–77.
Kemp, Kenneth W. “Aspects of the Statistical Analysis and Effective Use of Linguistic Data.” ALLC Bulletin, 4 (1976b), 14–22.
Lafon, Pierre. “Note à propos de l'article de S. Lusignan, ‘Textes, Corpus et Modèle Probabiliste’.” Informatique et Sciences Humaines, 14 (1984), 27–29.
Lusignan, Serge. “Textes, corpus et modèle probabiliste.” Informatique et Sciences Humaines, 14 (1984), 5–24.
Lusignan, Serge. “Quelques Réflexions sur le Statut Épistémologique du Texte Électronique.” Computers and the Humanities, 19 (1985), 209–12.
McCullagh, P. and J. A. Nelder. Generalized Linear Models. London-New York: Chapman and Hall, 1983.
Versini, Laurent. Laclos. Oeuvres complètes. Paris: Gallimard, 1979.
Author information
Christian Delcourt is a senior lecturer in the Department of Romance Philology at the University of Liège.
About this article
Cite this article
Delcourt, C. Where have all the key words gone? Comput Hum 23, 285–291 (1989). https://doi.org/10.1007/BF02176633
DOI: https://doi.org/10.1007/BF02176633