A niching genetic k-means algorithm and its applications to gene expression data

  • Original Paper
Soft Computing

Abstract

Partitional clustering is a common approach to cluster analysis. Although many algorithms have been proposed, partitional clustering remains a challenging problem with respect to the reliability and efficiency of recovering high-quality solutions in terms of its criterion functions. In this paper, we propose a niching genetic k-means algorithm (NGKA) for partitional clustering, which aims at reliably and efficiently identifying high-quality solutions in terms of the sum of squared errors criterion. Within the NGKA, we design a niching method, which encourages mating among similar clustering solutions while allowing for some competition among dissimilar solutions, and integrate it into a genetic algorithm to prevent premature convergence during the evolutionary clustering search. Further, we incorporate one step of the k-means operation into the regeneration steps of the resulting niching genetic algorithm to improve its computational efficiency. The proposed algorithm was applied to cluster both simulated data and gene expression data and compared with previous work. Experimental results clearly show that the NGKA is an effective clustering algorithm and outperforms two other genetic-algorithm-based clustering methods implemented for comparison.
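
To make two ingredients named in the abstract concrete, the following is a minimal sketch of the sum of squared errors criterion and of a single k-means step applied to one candidate clustering. It is not the authors' implementation: the function names, the label-list encoding of a solution, and the handling of empty clusters (falling back to the overall data mean) are all assumptions made for illustration, and the niching and genetic operators are not shown.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(data, labels, k):
    """Sum of squared errors: squared distance of each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(data, labels) if lab == c]
        if members:
            mu = centroid(members)
            total += sum(sq_dist(p, mu) for p in members)
    return total


def one_step_kmeans(data, labels, k):
    """One k-means step: recompute centroids from the labels, then reassign points."""
    overall = centroid(data)
    centroids = []
    for c in range(k):
        members = [p for p, lab in zip(data, labels) if lab == c]
        # Assumption: an empty cluster falls back to the overall data mean.
        centroids.append(centroid(members) if members else overall)
    # Reassign every point to its nearest centroid.
    return [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in data]


# Hypothetical toy data: two well-separated groups, one point mislabelled.
data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels = [0, 0, 0, 1]
new_labels = one_step_kmeans(data, labels, k=2)
```

In a hybrid scheme of this kind, a step like `one_step_kmeans` would be applied to each candidate solution as it is regenerated, so that the evolutionary search only has to find good starting partitions rather than fine-tune them.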

Author information

Corresponding author

Correspondence to Weiguo Sheng.

About this article

Cite this article

Sheng, W., Tucker, A. & Liu, X. A niching genetic k-means algorithm and its applications to gene expression data. Soft Comput 14, 9–19 (2010). https://doi.org/10.1007/s00500-008-0386-9

  • DOI: https://doi.org/10.1007/s00500-008-0386-9
