Abstract
The self-organizing, autonomous behavior of social insects such as ants offers a powerful metaphor for retrieving and managing the large and fast-growing volume of online information. The explosive growth of web documents has made the manual task of organizing them into meaningful categories increasingly difficult and costly for human experts. It is therefore desirable to automate part of the classification process, both to improve scalability and to keep human classifiers from being overwhelmed by the deluge of information. This paper presents a preliminary investigation of applying a homogeneous multi-agent clustering system, based on the self-organizing behavior of ants, to the high-dimensional problem of web document categorization. A description of the text processing needed to obtain significant document features is included. The system will be evaluated on multi-class online English documents obtained from a widely used search engine.
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hoe, K.M., Lai, W.K., Tai, T.S.Y. (2002). Homogeneous Ants for Web Document Similarity Modeling and Categorization. In: Dorigo, M., Di Caro, G., Sampels, M. (eds) Ant Algorithms. ANTS 2002. Lecture Notes in Computer Science, vol 2463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45724-0_24
DOI: https://doi.org/10.1007/3-540-45724-0_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44146-5
Online ISBN: 978-3-540-45724-4
eBook Packages: Springer Book Archive