Abstract
In this paper we point out the need to redesign the WebSOM document map creation algorithm. We argue that SOM clustering should be preceded by identification of the major topics of the document collection. Furthermore, it should be preceded by a pre-clustering step that creates groups of strongly related documents; these groups, rather than the individual documents, should be the subject of SOM clustering. We propose appropriate algorithms and report on the improvements achieved.
Research partially supported under KBN research grant 4 T11C 026 25 “Maps and intelligent navigation in WWW using Bayesian networks and artificial immune systems”.
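The two-stage idea in the abstract (group strongly related documents first, then run SOM clustering on the groups instead of on single documents) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the algorithm proposed in the paper: documents are taken to be plain term-weight vectors, the pre-clustering is a simple greedy cosine-threshold stand-in, the SOM is a tiny rectangular grid, and the function names (`precluster`, `train_som`, `map_documents`) and the `threshold` parameter are hypothetical. The topic-identification step mentioned in the abstract is omitted.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def precluster(docs, threshold=0.8):
    """Greedy stand-in for pre-clustering (assumption, not the paper's method):
    a document joins the first group whose centroid is at least `threshold`
    similar to it; otherwise it starts a new group."""
    groups = []  # each group is a list of document indices
    for i, doc in enumerate(docs):
        for g in groups:
            if cosine(doc, centroid([docs[j] for j in g])) >= threshold:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

def bmu(weights, v):
    """Index of the best matching unit (cell with the smallest squared distance to v)."""
    return min(range(len(weights)),
               key=lambda k: sum((w - x) ** 2 for w, x in zip(weights[k], v)))

def train_som(inputs, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns one weight vector per cell, row-major."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # linear decay of rate and radius
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for v in inputs:
            br, bc = divmod(bmu(weights, v), cols)
            for k in range(rows * cols):
                r, c = divmod(k, cols)
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2 * radius * radius))  # Gaussian neighbourhood
                weights[k] = [w + lr * h * (x - w) for w, x in zip(weights[k], v)]
    return weights

def map_documents(docs, rows=2, cols=2, threshold=0.8):
    """Pre-cluster documents, train the SOM on group centroids only,
    then place every document in the cell of its group."""
    groups = precluster(docs, threshold)
    centroids = [centroid([docs[j] for j in g]) for g in groups]
    weights = train_som(centroids, rows, cols)
    cell_of_doc = {}
    for g, c in zip(groups, centroids):
        cell = bmu(weights, c)
        for j in g:
            cell_of_doc[j] = cell
    return groups, cell_of_doc

# Toy collection: five documents over a three-term vocabulary.
docs = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.9, 0.1], [0, 0, 1]]
groups, cell_of_doc = map_documents(docs)
```

In this sketch the SOM sees one centroid per group, so the number of training vectors drops from the number of documents to the number of groups, and documents in the same group always land in the same map cell.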
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ciesielski, K., Dramiński, M., Kłopotek, M.A., Kujawiak, M., Wierzchoń, S.T. (2005). On Some Clustering Algorithms for Document Maps Creation. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 31. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32392-9_27
DOI: https://doi.org/10.1007/3-540-32392-9_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25056-2
Online ISBN: 978-3-540-32392-1
eBook Packages: Engineering, Engineering (R0)