Abstract
Word trees are a common way of representing frequency information obtained by analyzing natural language data. This article explores their uses and possibilities, and describes the development of an application that visualizes the relative frequencies of 2-grams and 3-grams in Google’s “English One Million” corpus, using a two-sided word tree and sparklines to show usage trends over time. It also discusses how the raw data was processed and trimmed to speed up access to it.
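To make the idea of a two-sided word tree with relative frequencies concrete, the following is a minimal sketch, not the application's actual implementation. It assumes 2-gram counts are available as a dictionary mapping word tuples to counts; the function names and the toy counts are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_two_sided_tree(ngram_counts, focus):
    """Collect the words that precede and follow `focus`, with summed counts.

    `ngram_counts` maps tuples of words to occurrence counts (assumed format).
    """
    left = defaultdict(int)   # word directly before the focus word -> count
    right = defaultdict(int)  # word directly after the focus word -> count
    for words, count in ngram_counts.items():
        for i, word in enumerate(words):
            if word != focus:
                continue
            if i > 0:
                left[words[i - 1]] += count
            if i + 1 < len(words):
                right[words[i + 1]] += count
    return dict(left), dict(right)


def relative_frequencies(branch):
    """Turn the raw counts of one side of the tree into shares of its total."""
    total = sum(branch.values())
    return {w: c / total for w, c in branch.items()} if total else {}


# Hypothetical toy 2-gram counts, not taken from the Google corpus.
counts = {
    ("word", "tree"): 6,
    ("decision", "tree"): 2,
    ("tree", "structure"): 3,
    ("tree", "view"): 1,
}

left, right = build_two_sided_tree(counts, "tree")
left_freq = relative_frequencies(left)
right_freq = relative_frequencies(right)
```

In a visualization, each side's relative frequencies could drive the font size of the branch labels. Per-year counts for the same n-gram would supply the series behind a sparkline.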
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Esteban, A., Therón, R. (2011). CorpusExplorer: Supporting a Deeper Understanding of Linguistic Corpora. In: Dickmann, L., Volkmann, G., Malaka, R., Boll, S., Krüger, A., Olivier, P. (eds) Smart Graphics. SG 2011. Lecture Notes in Computer Science, vol 6815. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22571-0_11
DOI: https://doi.org/10.1007/978-3-642-22571-0_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22570-3
Online ISBN: 978-3-642-22571-0
eBook Packages: Computer Science, Computer Science (R0)