Abstract
This paper introduces a new type of Self-Organizing Map (SOM) for text categorization and semantic browsing. We propose a “hyperbolic SOM” (HSOM) based on a regular tessellation of the hyperbolic plane, a non-Euclidean space characterized by constant negative Gaussian curvature. The approach is motivated by the observation that in hyperbolic geometry the size of a neighborhood around a point grows exponentially with its radius, which provides more freedom to map a complex information space such as language into spatial relations. These theoretical findings are supported by our experiments, which show that hyperbolic SOMs can be applied successfully to text categorization and yield results comparable to other state-of-the-art methods. Furthermore, we demonstrate that the HSOM maps large text collections in a semantically meaningful way and therefore allows a “semantic browsing” of text databases.
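The exponential growth of neighborhoods mentioned above can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes a hyperbolic plane of curvature K = -1 and uses the standard formula A(r) = 4π sinh²(r/2) for the area of a geodesic disc, compared with πr² in the Euclidean plane. The function names are hypothetical and chosen only for this illustration.

```python
import math


def euclidean_disc_area(r: float) -> float:
    """Area of a disc of radius r in the flat (Euclidean) plane."""
    return math.pi * r * r


def hyperbolic_disc_area(r: float) -> float:
    """Area of a geodesic disc of radius r in the hyperbolic plane.

    Assumes constant curvature K = -1 (an assumption of this sketch).
    Uses A(r) = 4*pi*sinh(r/2)^2, which equals 2*pi*(cosh(r) - 1).
    """
    return 4.0 * math.pi * math.sinh(r / 2.0) ** 2


if __name__ == "__main__":
    # Print both areas side by side for growing radii.
    print(f"{'r':>4} {'euclidean':>14} {'hyperbolic':>14}")
    for r in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(f"{r:>4} {euclidean_disc_area(r):>14.3f} {hyperbolic_disc_area(r):>14.3f}")
```

For small radii the two areas nearly coincide, while for large radii the hyperbolic area grows roughly by a factor of e per unit of radius. This is the extra "room" that a lattice embedded in hyperbolic space can use for its neighborhoods.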
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ontrup, J., Ritter, H. (2001). Text Categorization and Semantic Browsing with Self-Organizing Maps on Non-euclidean Spaces. In: De Raedt, L., Siebes, A. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2001. Lecture Notes in Computer Science, vol 2168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44794-6_28
DOI: https://doi.org/10.1007/3-540-44794-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42534-2
Online ISBN: 978-3-540-44794-8
eBook Packages: Springer Book Archive