Document Clustering Based on Vector Quantization and Growing-Cell Structure

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2718)

Abstract

In this paper, we propose a new hybrid clustering algorithm based on Vector Quantization (VQ) and Growing-Cell Structure (GCS). The basic idea is to use VQ to refine the GCS clustering results and thereby improve clustering performance. Moreover, the output of the proposed algorithm has a graph structure that is generated gradually during the incremental self-learning process. We evaluate the proposed method on real collections of text documents, and the experimental results show that it achieves better performance than the other methods compared.
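
The refinement idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the growing network has already produced a set of prototype vectors, and it swaps in a plain Lloyd-style VQ iteration (reassign each document to its nearest prototype, then move each prototype to the centroid of its members). The names `nearest`, `vq_refine`, `docs`, and `initial` are hypothetical, and the toy 2-D vectors stand in for real term-weight document vectors.

```python
import math


def nearest(vec, prototypes):
    """Return the index of the prototype closest to vec (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(vec, prototypes[i]))


def vq_refine(docs, prototypes, iterations=10):
    """Lloyd-style VQ refinement (an assumed stand-in, not the paper's method)."""
    prototypes = [list(p) for p in prototypes]  # work on a copy
    for _ in range(iterations):
        # Assignment step: group documents by nearest prototype.
        clusters = [[] for _ in prototypes]
        for d in docs:
            clusters[nearest(d, prototypes)].append(d)
        # Update step: move each prototype to the centroid of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cell keeps its previous prototype
                dim = len(members[0])
                prototypes[i] = [
                    sum(m[k] for m in members) / len(members) for k in range(dim)
                ]
    labels = [nearest(d, prototypes) for d in docs]
    return prototypes, labels


# Toy data: two obvious groups and two rough initial prototypes.
docs = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
initial = [[1.0, 1.0], [8.0, 8.0]]
prototypes, labels = vq_refine(docs, initial)
```

In this sketch the rough initial prototypes are pulled onto the centroids of the two groups, which is the sense in which a VQ pass can tighten clusters found by an earlier stage. The graph structure that the abstract mentions is not modeled here.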

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Su, Z., Zhang, L., Pan, Y. (2003). Document Clustering Based on Vector Quantization and Growing-Cell Structure. In: Chung, P.W.H., Hinde, C., Ali, M. (eds) Developments in Applied Artificial Intelligence. IEA/AIE 2003. Lecture Notes in Computer Science, vol 2718. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45034-3_33

  • DOI: https://doi.org/10.1007/3-540-45034-3_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40455-2

  • Online ISBN: 978-3-540-45034-4

  • eBook Packages: Springer Book Archive
