CorpusExplorer: Supporting a Deeper Understanding of Linguistic Corpora

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6815)

Abstract

Word trees are a common way of representing frequency information obtained by analyzing natural language data. This article explores their uses and possibilities, and describes the development of an application that visualizes the relative frequencies of 2-grams and 3-grams in Google’s “English One Million” corpus, using a two-sided word tree together with sparklines that show usage trends over time. It also discusses how the raw data was processed and trimmed to speed up access.
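The ideas named in the abstract (relative n-gram frequencies, a two-sided word tree, and sparklines) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(ngram, year, match_count)`, the toy data, the per-year totals, and the function names `relative_frequencies`, `two_sided_tree`, and `sparkline` are all assumptions made for illustration. The sketch handles 2-grams only.

```python
from collections import defaultdict

# Hypothetical toy records in the form (ngram, year, match_count).
# The real corpus files and their layout are not described in the abstract.
RECORDS = [
    ("word tree", 1990, 2),
    ("word tree", 2000, 6),
    ("word order", 1990, 8),
    ("word order", 2000, 4),
    ("key word", 1990, 5),
    ("key word", 2000, 10),
]

# Hypothetical total number of n-gram occurrences per year.
YEAR_TOTALS = {1990: 100, 2000: 200}


def relative_frequencies(records, year_totals):
    """Map each n-gram to {year: count / total count for that year}."""
    freqs = defaultdict(dict)
    for ngram, year, count in records:
        freqs[ngram][year] = count / year_totals[year]
    return dict(freqs)


def two_sided_tree(freqs, center):
    """Collect the 2-grams around `center`: words that precede it go to
    the left branch, words that follow it go to the right branch."""
    left, right = {}, {}
    for ngram, series in freqs.items():
        words = ngram.split()
        if len(words) != 2:
            continue  # this sketch ignores 3-grams
        if words[1] == center:
            left[words[0]] = series
        if words[0] == center:
            right[words[1]] = series
    return {"center": center, "left": left, "right": right}


BARS = "▁▂▃▄▅▆▇█"


def sparkline(series):
    """Render a {year: value} series as Unicode block characters,
    ordered by year and scaled between the series minimum and maximum."""
    values = [series[year] for year in sorted(series)]
    lo, hi = min(values), max(values)
    if hi == lo:
        return BARS[0] * len(values)
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in values)


if __name__ == "__main__":
    tree = two_sided_tree(relative_frequencies(RECORDS, YEAR_TOTALS), "word")
    for side in ("left", "right"):
        for word, series in sorted(tree[side].items()):
            print(side, word, sparkline(series))
```

Dividing by the yearly total, rather than plotting raw counts, keeps a trend line from rising merely because more text was scanned for later years; that is the usual reason for working with relative frequencies.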

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Esteban, A., Therón, R. (2011). CorpusExplorer: Supporting a Deeper Understanding of Linguistic Corpora. In: Dickmann, L., Volkmann, G., Malaka, R., Boll, S., Krüger, A., Olivier, P. (eds) Smart Graphics. SG 2011. Lecture Notes in Computer Science, vol 6815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22571-0_11

  • DOI: https://doi.org/10.1007/978-3-642-22571-0_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22570-3

  • Online ISBN: 978-3-642-22571-0

  • eBook Packages: Computer Science, Computer Science (R0)
