A Novel Typical-Sample-Weighted Clustering Algorithm for Large Data Sets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3801)

Abstract

In cluster analysis, most existing algorithms were developed for small data sets and cannot effectively process the large data sets encountered in data mining. Moreover, most clustering algorithms treat every sample as contributing equally to the classification, although different samples should contribute differently to the clustering result. To address both problems, a novel typical-sample-weighted clustering algorithm is proposed for large data sets. Through atom clustering, the new algorithm first extracts typical samples to reduce the amount of data. The extracted samples are then weighted by their typicality and clustered with the classical fuzzy c-means (FCM) algorithm. Finally, the Mahalanobis distance is used to assign each original sample to one of the obtained clusters. By clustering a reduced, weighted sample set instead of the full data, the new algorithm improves the speed and robustness of the traditional FCM algorithm. Experimental results on various test data sets illustrate the effectiveness of the proposed clustering algorithm.
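The weighting and assignment steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract gives no details of atom clustering, so the typical samples and their typicality weights are taken as inputs, and the demo substitutes a crude stand-in (rounded grid cells as typical samples, cell counts as weights). The function names, the farthest-point initialisation, the fuzzy covariance estimator, and the `ridge` term are all assumptions made for illustration. The only change to standard FCM is that each sample's term in the centre update is multiplied by its weight.

```python
import numpy as np

def memberships(X, V, m):
    """Standard FCM membership of each row of X in each centre of V."""
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def weighted_fcm(X, w, c, m=2.0, n_iter=100, tol=1e-6):
    """FCM on typical samples X (n, d) with typicality weights w (n,)."""
    # Deterministic start (an assumption): heaviest sample, then farthest points.
    V = [X[np.argmax(w)]]
    for _ in range(c - 1):
        gap = np.min([((X - v) ** 2).sum(axis=1) for v in V], axis=0)
        V.append(X[np.argmax(gap)])
    V = np.array(V, dtype=float)
    for _ in range(n_iter):
        # Each sample's usual FCM coefficient u^m is scaled by its weight.
        coef = w[:, None] * memberships(X, V, m) ** m
        V_new = (coef.T @ X) / coef.sum(axis=0)[:, None]
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:
            break
    return V, memberships(X, V, m)

def fuzzy_covariances(X, w, V, U, m=2.0, ridge=1e-6):
    """Weighted fuzzy covariance per cluster (assumed estimator, not from the paper)."""
    covs = []
    for i, v in enumerate(V):
        coef = w * U[:, i] ** m
        diff = X - v
        S = (coef[:, None] * diff).T @ diff / coef.sum()
        covs.append(S + ridge * np.eye(X.shape[1]))  # ridge keeps S invertible
    return np.array(covs)

def mahalanobis_assign(X, V, covs):
    """Label each row of X with the cluster of smallest Mahalanobis distance."""
    d2 = np.empty((len(X), len(V)))
    for i, (v, S) in enumerate(zip(V, covs)):
        diff = X - v
        d2[:, i] = np.einsum("nj,jk,nk->n", diff, np.linalg.inv(S), diff)
    return d2.argmin(axis=1)

# Demo on two synthetic Gaussian blobs of 200 points each.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.5, (200, 2)), rng.normal(6.0, 0.5, (200, 2))])
# Stand-in for the extraction step (NOT atom clustering): grid cells and counts.
typical, counts = np.unique(np.round(data) + 0.0, axis=0, return_counts=True)
V, U = weighted_fcm(typical, counts.astype(float), c=2)
covs = fuzzy_covariances(typical, counts, V, U)
labels = mahalanobis_assign(data, V, covs)  # one label per original sample
```

In this sketch the iterative FCM loop only ever sees the rows of `typical`, which is where the speed-up for large data sets would come from; the full data set is touched once, in the final Mahalanobis assignment.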

This work was supported by the National Natural Science Foundation of China (No. 60202004), the Key Project of the Chinese Ministry of Education (No. 104173), and the Program for New Century Excellent Talents in University of China.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, J., Gao, X., Jiao, L. (2005). A Novel Typical-Sample-Weighted Clustering Algorithm for Large Data Sets. In: Hao, Y., et al. Computational Intelligence and Security. CIS 2005. Lecture Notes in Computer Science, vol 3801. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596448_103

  • DOI: https://doi.org/10.1007/11596448_103

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30818-8

  • Online ISBN: 978-3-540-31599-5

  • eBook Packages: Computer Science, Computer Science (R0)
