Homogeneous Ants for Web Document Similarity Modeling and Categorization

  • Conference paper
Ant Algorithms (ANTS 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2463)

Abstract

The self-organizing and autonomous behavior of social insects such as ants offers an interesting and powerful metaphor for retrieving and managing the large and fast-growing volume of online information. The explosive growth of web documents has made the manual task of organizing them into meaningful categories increasingly difficult and costly for human experts. Hence, it is desirable to incorporate some degree of automation into the classification process, both to improve scalability and to prevent human classifiers from being overwhelmed by the deluge of information. This paper presents a preliminary investigation of applying a homogeneous multi-agent clustering system, based on the self-organizing behavior of ants, to the high-dimensional problem of web document categorization. A description of the text processing needed to obtain significant document features is included. The system will be evaluated on multi-class online English documents obtained from a popular search engine.
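
For readers unfamiliar with ant-based clustering, the following is a minimal sketch of the pick-up and drop rules used in one common formulation of the idea (following Deneubourg et al. and Lumer and Faieta). It is not taken from this paper: the function names, the dictionary-based term vectors, the cosine distance, and the parameter values `alpha`, `k1`, `k2`, and `cells` are all assumptions chosen for illustration, and the paper's own text processing and agent rules may differ.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two term-weight dicts (term -> weight)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    if nu == 0.0 or nv == 0.0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (nu * nv)

def neighborhood_density(doc, neighbors, alpha=0.5, cells=9):
    """Average similarity of `doc` to documents in its local grid neighborhood.

    `cells` is the neighborhood size (assumed 3x3 here) and `alpha` scales
    the distance; both are illustrative values, not values from the paper.
    """
    total = sum(1.0 - cosine_distance(doc, n) / alpha for n in neighbors)
    return max(0.0, total / cells)

def pick_probability(f, k1=0.1):
    """An unladen ant is likely to pick up a document in a sparse or dissimilar area."""
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.15):
    """A laden ant is likely to drop its document among similar documents."""
    return (f / (k2 + f)) ** 2

# Hypothetical toy documents represented as term-weight dicts.
ants_doc = {"ant": 1.0, "colony": 1.0}
cars_doc = {"engine": 1.0, "wheel": 1.0}

f_similar = neighborhood_density(ants_doc, [ants_doc] * 8)
f_dissimilar = neighborhood_density(ants_doc, [cars_doc] * 8)
print(pick_probability(f_similar), drop_probability(f_similar))
print(pick_probability(f_dissimilar), drop_probability(f_dissimilar))
```

In this sketch every ant applies the same two rules, which is one way to read "homogeneous" agents; a document surrounded by similar documents tends to stay put, while an isolated or misplaced one tends to be carried elsewhere.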

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hoe, K.M., Lai, W.K., Tai, T.S.Y. (2002). Homogeneous Ants for Web Document Similarity Modeling and Categorization. In: Dorigo, M., Di Caro, G., Sampels, M. (eds) Ant Algorithms. ANTS 2002. Lecture Notes in Computer Science, vol 2463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45724-0_24

  • DOI: https://doi.org/10.1007/3-540-45724-0_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44146-5

  • Online ISBN: 978-3-540-45724-4

  • eBook Packages: Springer Book Archive
