Quantitative Text Analysis Using L-, F- and T-Segments

  • Conference paper
Data Analysis, Machine Learning and Applications

Abstract

It is shown that word length and other properties of linguistic units display lawful behavior not only in the form of distributions but also with respect to their syntagmatic arrangement in a text. Based on L-segments (units of constant or increasing length), F-segments, and T-segments (units of constant or increasing frequency or polytextuality, respectively), the dynamic behavior of segment patterns is investigated. Theoretical models are derived on the basis of plausible assumptions about how the properties of individual units influence the properties of their constituents in the text. The corresponding hypotheses are tested on data from 66 German texts by four authors in two different genres. Experiments with various characteristics show promising properties that could be useful for author and/or genre discrimination.
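
To make the segment definition concrete, the following is a minimal sketch, not the authors' implementation. It assumes that an L-segment is a maximal run of adjacent units whose lengths stay constant or increase, so a new segment begins whenever a value drops below its predecessor. The function name `monotone_segments` and the sample word lengths are hypothetical illustrations. Under the same assumption, passing word frequencies or polytextuality values instead of lengths would yield F-segments or T-segments.

```python
from collections import Counter
from typing import List, Sequence


def monotone_segments(values: Sequence[int]) -> List[List[int]]:
    """Split values into maximal runs that never decrease.

    Assumption: a segment ends as soon as the next value is
    smaller than the current one.
    """
    segments: List[List[int]] = []
    for v in values:
        if segments and v >= segments[-1][-1]:
            # Constant or increasing: extend the current segment.
            segments[-1].append(v)
        else:
            # First value, or a decrease: start a new segment.
            segments.append([v])
    return segments


# Hypothetical word lengths (e.g. in syllables) of ten running words.
word_lengths = [1, 2, 2, 1, 3, 1, 1, 2, 4, 2]

l_segments = monotone_segments(word_lengths)
print(l_segments)
# [[1, 2, 2], [1, 3], [1, 1, 2, 4], [2]]

# How many segments have each length (number of units per segment).
segment_length_counts = Counter(len(s) for s in l_segments)
print(sorted(segment_length_counts.items()))
# [(1, 1), (2, 1), (3, 1), (4, 1)]
```

Counts of this kind, such as how often each segment length occurs in a text, are one example of a characteristic that could be compared across authors or genres.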

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Köhler, R., Naumann, S. (2008). Quantitative Text Analysis Using L-, F- and T-Segments. In: Preisach, C., Burkhardt, H., Schmidt-Thieme, L., Decker, R. (eds) Data Analysis, Machine Learning and Applications. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78246-9_75
