DOI: 10.1145/2001576.2001742
research-article

PSO aided k-means clustering: introducing connectivity in k-means

Published: 12 July 2011

ABSTRACT

Clustering is a fundamental and hence widely studied problem in data analysis. From a multi-objective perspective, this paper combines principles from two different clustering paradigms: the connectivity principle from density-based methods is integrated into the partitional clustering approach. The standard k-Means algorithm is hybridized with Particle Swarm Optimization. The new method (PSO-kMeans) benefits from both a local and a global view of the data and alleviates some drawbacks of the k-Means algorithm; as a result, it can detect types of clusters that are otherwise difficult to obtain (elongated shapes, dissimilar volumes). Our experimental results show that PSO-kMeans improves on the performance of standard k-Means in all test cases and, in the worst case, performs at least comparably to state-of-the-art methods. PSO-kMeans is also robust to outliers. This comes at a cost: a preprocessing step that finds the nearest neighbors of each data item is required, which raises the initially linear complexity of k-Means to quadratic.
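
To make the cost mentioned above concrete, the following is a minimal sketch of a brute-force nearest-neighbor preprocessing step, paired with a simple connectivity penalty of the kind used in multi-objective clustering. The abstract does not give the paper's actual formulation, so the function names (`nearest_neighbors`, `connectivity_penalty`), the `n_neighbors` parameter, and the 1/rank weighting are all assumptions made for illustration, not the authors' method.

```python
import math


def nearest_neighbors(points, n_neighbors):
    """Brute-force neighbor lists (hypothetical helper).

    Every point is compared with every other point, so the number of
    distance evaluations grows quadratically with the number of points.
    """
    neighbors = []
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()  # nearest first; ties broken by index
        neighbors.append([j for _, j in dists[:n_neighbors]])
    return neighbors


def connectivity_penalty(labels, neighbors):
    """Assumed connectivity measure (lower is better).

    Adds 1/rank whenever the rank-th nearest neighbor of a point
    has been assigned to a different cluster than the point itself.
    """
    penalty = 0.0
    for i, nbrs in enumerate(neighbors):
        for rank, j in enumerate(nbrs, start=1):
            if labels[j] != labels[i]:
                penalty += 1.0 / rank
    return penalty


# Two elongated, well-separated columns of points.
points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0),
          (5.0, 0.0), (5.0, 1.0), (5.0, 2.0)]
nn = nearest_neighbors(points, n_neighbors=2)

by_column = [0, 0, 0, 1, 1, 1]    # respects the neighborhood structure
alternating = [0, 1, 0, 1, 0, 1]  # splits neighbors across clusters

penalty_by_column = connectivity_penalty(by_column, nn)
penalty_alternating = connectivity_penalty(alternating, nn)
```

In this toy setup the partition that follows the two columns incurs no penalty, while the alternating labeling is penalized. A term of this kind could be optimized alongside the usual k-Means compactness criterion, which is one way to read the "local and global view" described in the abstract.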


      • Published in

        GECCO '11: Proceedings of the 13th annual conference on Genetic and evolutionary computation
        July 2011
        2140 pages
        ISBN:9781450305570
        DOI:10.1145/2001576

        Copyright © 2011 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
