A genetic k-medoids clustering algorithm

Journal of Heuristics


Abstract

We propose a hybrid genetic algorithm for k-medoids clustering. A novel heuristic operator is designed and integrated with the genetic algorithm to fine-tune the search. Further, variable-length individuals that encode different numbers of medoids (clusters) are evolved, with a modified Davies-Bouldin index measuring the fitness of the corresponding partitionings. As a result, the proposed algorithm can efficiently evolve appropriate partitionings without making any a priori assumption about the number of clusters present in the datasets. In the experiments, we show the effectiveness of the proposed algorithm and compare it with other related clustering methods.
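To make the fitness idea concrete, the following is a minimal sketch of how variable-length individuals could be scored. It assumes each individual is a list of indices into the dataset (the medoids) and uses the standard Davies-Bouldin index with medoids as cluster representatives; the modified index used in the paper is not specified in the abstract and is not reproduced here. The function names, the toy data, and the candidate individuals are all hypothetical.

```python
import math

def assign_to_medoids(points, medoids):
    """For each point, return the position (0..k-1) of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: math.dist(p, points[medoids[c]]))
        for p in points
    ]

def davies_bouldin(points, medoids):
    """Standard Davies-Bouldin index (lower is better).

    Assumption: medoids are indices of points with pairwise distinct
    coordinates, so no cluster is empty and no separation is zero.
    This is NOT the modified index from the paper.
    """
    k = len(medoids)
    if k < 2:
        raise ValueError("the index needs at least two clusters")
    labels = assign_to_medoids(points, medoids)

    # Scatter of cluster c: mean distance from its members to its medoid.
    scatter = []
    for c in range(k):
        centre = points[medoids[c]]
        members = [p for p, label in zip(points, labels) if label == c]
        scatter.append(sum(math.dist(p, centre) for p in members) / len(members))

    # For each cluster, take the worst (largest) similarity ratio to another cluster.
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j])
            / math.dist(points[medoids[i]], points[medoids[j]])
            for j in range(k)
            if j != i
        )
    return total / k

# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Individuals of different lengths encode different numbers of medoids.
candidates = [[0, 3], [0, 1, 3], [0, 3, 4]]
best = min(candidates, key=lambda ind: davies_bouldin(points, ind))
print(best, davies_bouldin(points, best))
```

Because the index is defined for any number of clusters from two upward, individuals of different lengths can be ranked against each other directly, which is what lets the search proceed without a fixed number of clusters.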

Author information

Corresponding author

Correspondence to Weiguo Sheng.

About this article

Cite this article

Sheng, W., Liu, X. A genetic k-medoids clustering algorithm. J Heuristics 12, 447–466 (2006). https://doi.org/10.1007/s10732-006-7284-z

  • DOI: https://doi.org/10.1007/s10732-006-7284-z
