Fast Clustering Algorithm for Information Organization

Shin, Kwangcheol; Han, Sangyong

doi:10.1007/3-540-36456-0_69


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2588)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics


Abstract

This study addresses information organization for more efficient searching and browsing of Internet documents. For this purpose it proposes a heuristic algorithm that behaves similarly to the star clustering algorithm but runs in O(kn) time (k << n) rather than the O(n²) time of the star clustering algorithm. The proposed algorithm uses cosine similarity and takes the vector with the most non-zero elements as its initial reference vector. It then performs clustering based on the concept vector and produces clusters for information organization in O(kn) time. To measure how quickly the proposed algorithm produces clusters, it is tested on the TIME and CLASSIC3 collections and compared with the star clustering algorithm.
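The abstract gives only the ingredients of the method (cosine similarity, a seed vector with the most non-zero elements, concept vectors, O(kn) cost), not its exact steps. The following is a minimal sketch of one way these ingredients could fit together, not the authors' implementation. The function names, the similarity threshold `sigma`, the single re-collection pass around the concept vector, and the reading of k as the number of clusters produced are all assumptions made for illustration.

```python
import math

def normalize(vec):
    """Scale a sparse vector (dict: term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)

def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def concept_vector(members):
    """Normalized sum of the member vectors, used as the cluster's concept vector."""
    total = {}
    for vec in members:
        for t, w in vec.items():
            total[t] = total.get(t, 0.0) + w
    return normalize(total)

def fast_cluster(docs, sigma=0.3):
    """Greedy sketch: each cluster costs a constant number of passes over the
    unassigned documents, so k clusters need O(kn) similarity computations.
    `sigma` is a hypothetical threshold, not a value taken from the paper."""
    vecs = [normalize(d) for d in docs]
    unassigned = set(range(len(vecs)))
    clusters = []
    while unassigned:
        # Seed: the unassigned document with the most non-zero elements
        # (ties broken by the lower index).
        seed = max(unassigned, key=lambda i: (len(vecs[i]), -i))
        # First pass: documents similar enough to the seed.
        members = [i for i in sorted(unassigned)
                   if i == seed or cosine(vecs[seed], vecs[i]) >= sigma]
        # Second pass (assumed): re-collect members around the concept vector.
        centre = concept_vector([vecs[i] for i in members])
        members = [i for i in sorted(unassigned)
                   if i == seed or cosine(centre, vecs[i]) >= sigma]
        clusters.append(members)
        unassigned -= set(members)
    return clusters
```

Because the seed always joins its own cluster, every round removes at least one document and the loop terminates. By contrast, the O(n²) figure the abstract quotes for the star clustering algorithm matches the cost of comparing every pair of documents, which this sketch avoids by comparing documents only against one seed and one concept vector per cluster.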

This research is supported by the ITRI of Chung-Ang University.


References

Aslam, J., Pelekhov, K., and Rus, D.: Information Organization Algorithms. In Proceedings of the International Conference on Advances in Infrastructure for Electronic Business, Science, and Education on the Internet (2000)
Dhillon, I. S., and Modha, D. S.: Concept Decomposition for Large Sparse Text Data using Clustering. Technical Report RJ 10147 (9502), IBM Almaden Research Center (1999)

Author information

Authors and Affiliations

Dept. of Computer Science and Engineering, Chung-Ang Univ., 221 Huksuk-Dong, 156-756, DongJak-Ku, Seoul, Korea
Kwangcheol Shin & Sangyong Han


Editor information

Editors and Affiliations

Centro de Investigación en Computación (CIC), Instituto Politécnico Nacional (IPN), Col. Zacatenco, CP 07738, Mexico D.F., Mexico
Alexander Gelbukh


About this paper

Cite this paper

Shin, K., Han, S. (2003). Fast Clustering Algorithm for Information Organization. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_69


DOI: https://doi.org/10.1007/3-540-36456-0_69
Published: 30 April 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00532-2
Online ISBN: 978-3-540-36456-6
eBook Packages: Springer Book Archive
