Abstract
Text collections may be regarded as an almost perfect application arena for unsupervised neural networks. This is because many of the operations computers have to perform on text documents are classification tasks based on noisy patterns. In particular, we rely on self-organizing maps, which produce a map of the document space after their training process. From geography, however, it is known that maps are not always the best way to represent information spaces. For most applications it is better to provide a hierarchical view of the underlying data collection in the form of an atlas, where, starting from a map representing the complete data collection, different regions are shown at finer levels of granularity. Using an atlas, the user can easily “zoom” into regions of particular interest while still having general maps for overall orientation. We show that a similar display can be obtained by using hierarchical feature maps to represent the contents of a document archive. These neural networks have a layered architecture in which each layer consists of a number of individual self-organizing maps. In this way, the contents of the text archive may be represented at arbitrary detail while the general maps remain available for global orientation.
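The layered architecture described above can be illustrated with a minimal two-layer sketch: one top-level self-organizing map is trained on all documents, and a separate child map is trained for each top-level unit on the documents assigned to it. This is only an illustration under stated assumptions and does not attempt to reproduce the training procedure of the paper. The function names (`train_som`, `best_unit`, `train_hierarchy`, `locate`), the 2×2 grid sizes, the linearly decaying learning rate, the Gaussian neighborhood, and the toy document vectors are all hypothetical choices made for the example.

```python
import math
import random


def best_unit(weights, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(weights,
               key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}.

    Assumed schedule: learning rate and neighborhood width decay linearly.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = t / steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            win = best_unit(weights, x)
            for unit, w in weights.items():
                # Gaussian neighborhood around the winner on the 2-D grid
                d2 = (unit[0] - win[0]) ** 2 + (unit[1] - win[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            t += 1
    return weights


def train_hierarchy(data, top=(2, 2), child=(2, 2)):
    """Train a top-level map, then one child map per occupied top-level unit."""
    top_map = train_som(data, *top)
    groups = {}
    for x in data:
        groups.setdefault(best_unit(top_map, x), []).append(x)
    children = {u: train_som(docs, *child, seed=1)
                for u, docs in groups.items()}
    return top_map, children


def locate(top_map, children, x):
    """Map x to (top-level unit, child unit); child is None if no child map exists."""
    u = best_unit(top_map, x)
    child_map = children.get(u)
    return u, (best_unit(child_map, x) if child_map else None)


# Toy term-weight vectors standing in for indexed documents (made up).
docs = [[1, 0, 0, 0], [0.9, 0.1, 0, 0], [0, 0, 1, 0],
        [0, 0, 0.9, 0.1], [0, 1, 0, 0], [0, 0, 0, 1]]
top_map, children = train_hierarchy(docs)
print(locate(top_map, children, docs[0]))
```

In the atlas metaphor, the first element returned by `locate` corresponds to a region on the overview map, and the second to a position on the more detailed map for that region.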
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Merkl, D., Rauber, A. (1998). CIA's view of the world and what neural networks learn from it: A comparison of geographical document space representation metaphors. In: Quirchmayr, G., Schweighofer, E., Bench-Capon, T.J. (eds) Database and Expert Systems Applications. DEXA 1998. Lecture Notes in Computer Science, vol 1460. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054537
DOI: https://doi.org/10.1007/BFb0054537
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64950-2
Online ISBN: 978-3-540-68060-4
eBook Packages: Springer Book Archive