Topic Discovery from Document Using Ant-Based Clustering Combination

  • Conference paper
Web Technologies Research and Development - APWeb 2005 (APWeb 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3399)

Abstract

This paper presents a topic discovery approach based on combining the clusterings of multiple ant colonies. The algorithm has three parts. First, each document is represented as a feature vector in a vector space model. Second, a hypergraph model combines the clusterings produced by three kinds of ant-based algorithms with different moving speeds. Finally, the topic of each cluster is extracted by re-computing the term weights. Test results show that the number of topics can be determined adaptively and that clustering combination can improve system performance.
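
The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the tf-idf weighting, the per-cluster re-weighting, and all function names here are assumptions made for illustration. The ant-based clusterers are not implemented; their outputs are taken as given label lists. The consensus step swaps in a simple thresholded co-association grouping (connected components) in place of a real hypergraph partitioner.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Stage 1 (assumed weighting): map each tokenized document to {term: tf * idf}."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: f * math.log(n / df[t]) for t, f in Counter(doc).items()}
            for doc in docs]


def hyperedges(labelings):
    """Stage 2a: every cluster of every input clustering becomes one hyperedge."""
    edges = []
    for labels in labelings:
        for cluster in sorted(set(labels)):
            edges.append({i for i, lab in enumerate(labels) if lab == cluster})
    return edges


def co_association(labelings):
    """Stage 2b: fraction of clusterings in which two documents share a hyperedge."""
    n = len(labelings[0])
    votes = [[0] * n for _ in range(n)]
    for edge in hyperedges(labelings):
        for i in edge:
            for j in edge:
                votes[i][j] += 1
    return [[v / len(labelings) for v in row] for row in votes]


def consensus_clusters(sim, threshold=0.5):
    """Stage 2c (stand-in for hypergraph partitioning): connected components
    of the graph whose edges are pairs with similarity above the threshold.
    The number of consensus clusters falls out of the data."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def cluster_topics(docs, labels, top_k=2):
    """Stage 3 (assumed re-weighting): score each term by its in-cluster
    frequency times log(number of clusters / clusters containing the term)."""
    clusters = sorted(set(labels))
    counts = {c: Counter() for c in clusters}
    for doc, lab in zip(docs, labels):
        counts[lab].update(doc)
    cf = Counter(t for c in clusters for t in counts[c])
    topics = {}
    for c in clusters:
        scored = {t: f * math.log(len(clusters) / cf[t])
                  for t, f in counts[c].items()}
        # Sort by descending score, breaking ties alphabetically.
        topics[c] = sorted(scored, key=lambda t: (-scored[t], t))[:top_k]
    return topics
```

Given a list of tokenized documents and three label lists (one per hypothetical ant colony), `consensus_clusters(co_association(labelings))` yields the combined partition, and `cluster_topics` returns the highest-weighted terms of each consensus cluster as its topic.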


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, Y., Kamel, M., Jin, F. (2005). Topic Discovery from Document Using Ant-Based Clustering Combination. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_11

  • DOI: https://doi.org/10.1007/978-3-540-31849-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25207-8

  • Online ISBN: 978-3-540-31849-1

  • eBook Packages: Computer Science, Computer Science (R0)
