Abstract
We applied the well-known WEBSOM method, which is based on a two-layer architecture, to the categorization of documents written in Czech. Our research focused on the syntactic and semantic relationships within the word categories of the word category map (WCM). The document classification system was tested on a subset of 100 documents (manual work was necessary) from the corpus of Czech News Agency documents. The results confirmed that the WEBSOM method is hard to evaluate, because humans have difficulty with natural language semantics and with determining semantic domains from word categories.
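To make the two-layer idea concrete, the following is a minimal sketch and not the system described in this paper. It assumes a simplified WEBSOM-style pipeline: each word gets a random code vector, a word's context vector averages the codes of its neighbours, a first self-organizing map groups words into categories (the WCM), and a second map organizes documents by their histograms over those categories. The function names, map sizes, training schedule, and the toy English documents are all hypothetical choices made for illustration.

```python
import math
import random


def best_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights,
               key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=40, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.uniform(-0.5, 0.5) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        for x in data:
            br, bc = best_unit(weights, x)
            for (r, c), w in weights.items():
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def context_vectors(docs, dim=8, seed=0):
    """Average the random codes of each word's left and right neighbours."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    code = {w: [rng.gauss(0, 1) for _ in range(dim)] for w in vocab}
    sums = {w: [0.0] * (2 * dim) for w in vocab}
    counts = {w: 0 for w in vocab}
    for d in docs:
        for i, w in enumerate(d):
            prev = code[d[i - 1]] if i > 0 else [0.0] * dim
            nxt = code[d[i + 1]] if i + 1 < len(d) else [0.0] * dim
            for j, v in enumerate(prev + nxt):
                sums[w][j] += v
            counts[w] += 1
    return {w: [s / counts[w] for s in sums[w]] for w in vocab}


def category_histogram(doc, word_unit, units):
    """Normalized count of a document's words per word-category unit."""
    index = {u: i for i, u in enumerate(units)}
    hist = [0.0] * len(units)
    for w in doc:
        hist[index[word_unit[w]]] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def websom_sketch(docs, wcm_shape=(2, 2), doc_shape=(2, 2)):
    """Layer 1: word category map. Layer 2: document map over histograms."""
    ctx = context_vectors(docs)
    words = sorted(ctx)
    wcm = train_som([ctx[w] for w in words], *wcm_shape)
    word_unit = {w: best_unit(wcm, ctx[w]) for w in words}
    hists = [category_histogram(d, word_unit, sorted(wcm)) for d in docs]
    doc_map = train_som(hists, *doc_shape, seed=1)
    doc_unit = [best_unit(doc_map, h) for h in hists]
    return word_unit, hists, doc_unit


# Toy tokenized documents, invented for this sketch (not from the corpus).
TOY_DOCS = [
    "the bank raised interest rates".split(),
    "the bank cut interest rates".split(),
    "the team won the match".split(),
    "the team lost the match".split(),
]

word_unit, hists, doc_unit = websom_sketch(TOY_DOCS)
```

Here `word_unit` maps each word to its category on the first map and `doc_unit` gives each document's position on the second map. Inspecting which words share a unit is the kind of manual judgement the abstract identifies as difficult.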
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mouček, R., Mautner, P. (2009). WEBSOM Method - Word Categories in Czech Written Documents. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science, vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_15
DOI: https://doi.org/10.1007/978-3-642-04208-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer Science, Computer Science (R0)