On Some Clustering Algorithms for Document Maps Creation

  • Conference paper
Intelligent Information Processing and Web Mining

Abstract

In this paper we point out the need to redesign the WebSOM document map creation algorithm. We argue that SOM clustering should be preceded by identification of the major topics of the document collection. Furthermore, SOM clustering should be preceded by a pre-clustering step that creates groups of strongly related documents; these groups, rather than the individual documents, should then be the subject of SOM clustering. We propose appropriate algorithms and report on the improvements achieved.
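
The two-stage idea in the abstract (group the documents first, then run SOM on the groups) can be illustrated with a small sketch. This is not the authors' algorithm: plain k-means is used here only as a stand-in for the pre-clustering step, the SOM is a textbook rectangular map with linearly decaying learning rate and neighbourhood width, the topic-identification step is omitted, and the random vectors are placeholders for real term-weight vectors. All function names and parameters (`pre_cluster`, `train_som`, `n_groups`, and so on) are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vectors, x):
    """Index of the vector in `vectors` that is closest to x."""
    return min(range(len(vectors)), key=lambda i: sq_dist(vectors[i], x))


def pre_cluster(docs, n_groups, n_iter=20, seed=0):
    """Plain k-means, used here only as a stand-in pre-clustering step."""
    rng = random.Random(seed)
    centroids = [list(d) for d in rng.sample(docs, n_groups)]
    for _ in range(n_iter):
        labels = [nearest(centroids, d) for d in docs]
        for g in range(n_groups):
            members = [d for d, lab in zip(docs, labels) if lab == g]
            if members:  # keep the old centroid if a group became empty
                centroids[g] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(centroids, d) for d in docs]
    return centroids, labels


def train_som(inputs, rows, cols, n_epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; cells are stored row-major in a flat list."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3  # linearly shrinking neighbourhood
        order = list(inputs)
        rng.shuffle(order)
        for x in order:
            br, bc = divmod(nearest(weights, x), cols)  # best-matching cell
            for cell, w in enumerate(weights):
                r, c = divmod(cell, cols)
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Toy data: 60 random 8-dimensional vectors standing in for term-weight vectors.
data_rng = random.Random(1)
docs = [[data_rng.random() for _ in range(8)] for _ in range(60)]

centroids, labels = pre_cluster(docs, n_groups=6)
weights = train_som(centroids, rows=3, cols=3)  # the SOM sees groups, not documents
group_cells = [divmod(nearest(weights, c), 3) for c in centroids]
doc_cells = [group_cells[g] for g in labels]    # each document inherits its group's cell
```

In this sketch the SOM is trained on 6 group centroids instead of 60 document vectors, so the map training cost depends on the number of groups rather than on the collection size, and every document is placed on the map through the group it belongs to.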

Research partially supported under KBN research grant 4 T11C 026 25 “Maps and intelligent navigation in WWW using Bayesian networks and artificial immune systems”.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ciesielski, K., Dramiński, M., Kłopotek, M.A., Kujawiak, M., Wierzchoń, S.T. (2005). On Some Clustering Algorithms for Document Maps Creation. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 31. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32392-9_27

  • DOI: https://doi.org/10.1007/3-540-32392-9_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25056-2

  • Online ISBN: 978-3-540-32392-1

  • eBook Packages: Engineering, Engineering (R0)
