Abstract
We present a novel approach to the incremental creation of document maps. It relies on partitioning a given collection of documents into a hierarchy of homogeneous groups, each represented by its own set of terms. Each group, which in effect defines a separate context, is then explored by a modified version of the aiNet immune algorithm to extract its inner structure. The immune cells produced by the algorithm become the reference vectors used to prepare the final document map. This approach proves robust with respect to time and space requirements as well as the quality of the resulting clustering model.
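The pipeline outlined in the abstract can be illustrated with a small sketch. It is not the authors' modified aiNet: it is a heavily simplified clonal selection and suppression loop, run once per context, and it assumes the contexts have already been formed. The function names (`cosine`, `ainet_sketch`), all parameter values, and the toy term-weight vectors are hypothetical and chosen only for illustration.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ainet_sketch(docs, n_clones=3, mutation=0.2, suppress=0.95,
                 generations=5, seed=0):
    """Simplified immune-network loop (assumed, not the paper's variant).

    docs: list of document vectors that share one context's term set.
    Returns the surviving cells, to be used as reference vectors.
    """
    rng = random.Random(seed)
    # Start from a couple of documents picked at random as initial cells.
    cells = [list(d) for d in rng.sample(docs, min(2, len(docs)))]
    for _generation in range(generations):
        for doc in docs:
            # Selection: the cell most similar to the presented document.
            best = max(cells, key=lambda c: cosine(c, doc))
            # Cloning and mutation: move each clone a random step toward doc.
            clones = []
            for _clone in range(n_clones):
                rate = mutation * rng.random()
                clones.append([c + rate * (d - c) for c, d in zip(best, doc)])
            cells.append(max(clones, key=lambda c: cosine(c, doc)))
        # Suppression: drop cells that are too similar to one already kept.
        kept = []
        for cell in cells:
            if all(cosine(cell, k) < suppress for k in kept):
                kept.append(cell)
        cells = kept
    return cells


# Toy contexts (hypothetical): each context has its own term set,
# so the vector length differs between contexts.
contexts = {
    "sports": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.0, 0.1, 1.0]],
    "finance": [[1.0, 0.0], [0.8, 0.3], [0.0, 1.0]],
}

# One immune network per context; its cells act as reference vectors.
reference_vectors = {name: ainet_sketch(vectors)
                     for name, vectors in contexts.items()}

for name, cells in reference_vectors.items():
    print(name, len(cells), "reference vectors")
```

Running each context separately keeps every network in that context's own, smaller term space, which is the source of the time and space savings the abstract refers to. The suppression step keeps the set of reference vectors compact.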
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ciesielski, K., Wierzchoń, S.T., Kłopotek, M.A. (2006). An Immune Network for Contextual Text Data Clustering. In: Bersini, H., Carneiro, J. (eds) Artificial Immune Systems. ICARIS 2006. Lecture Notes in Computer Science, vol 4163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823940_33
DOI: https://doi.org/10.1007/11823940_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37749-8
Online ISBN: 978-3-540-37751-1
eBook Packages: Computer Science, Computer Science (R0)