A hybrid stochastic genetic–GRASP algorithm for clustering analysis

  • Original Paper
Operational Research

Abstract

This paper presents a new stochastic methodology, based on the concepts of genetic algorithms (GAs) and the greedy randomized adaptive search procedure (GRASP), for optimally clustering N objects into K clusters. The proposed algorithm (Hybrid GEN–GRASP) is a two-phase algorithm that combines a genetic algorithm for solving the feature selection problem with a GRASP algorithm for solving the clustering problem. Because its search is stochastic and population-based, the proposed algorithm can overcome the drawbacks of traditional clustering methods. Its performance is compared with that of another methodology that solves the feature selection problem with a very popular metaheuristic, the Tabu Search algorithm. Results from applying the methodology to data sets from the UCI Machine Learning Repository are presented.
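
The two-phase structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the operators, parameters, or quality measure, so everything below is an assumption made for illustration. In particular, the function names (`grasp_cluster`, `gen_grasp`), the farthest-point restricted candidate list, the k-means style local search, the one-point crossover with bit-flip mutation, and the fitness (clustering cost divided by the number of selected features) are all hypothetical choices.

```python
import random

def sq_dist(a, b, mask):
    """Squared Euclidean distance restricted to the features selected by mask."""
    return sum((x - y) ** 2 for x, y, m in zip(a, b, mask) if m)

def assign(data, centers, mask):
    """Assign every object to its nearest center; return labels and total cost."""
    labels, cost = [], 0.0
    for p in data:
        d = [sq_dist(p, c, mask) for c in centers]
        j = min(range(len(centers)), key=d.__getitem__)
        labels.append(j)
        cost += d[j]
    return labels, cost

def grasp_cluster(data, k, mask, rng, iters=10, alpha=0.3):
    """GRASP (assumed form): greedy randomized construction, then local search."""
    best_labels, best_cost = None, float("inf")
    for _ in range(iters):
        # Construction: random first center, then each further center is drawn
        # from a restricted candidate list (RCL) of the farthest objects.
        centers = [rng.choice(data)]
        while len(centers) < k:
            ranked = sorted(
                data,
                key=lambda p: min(sq_dist(p, c, mask) for c in centers),
                reverse=True,
            )
            rcl = ranked[: max(1, int(alpha * len(data)))]
            centers.append(rng.choice(rcl))
        # Local search: k-means style updates until the cost stops improving.
        labels, cost = assign(data, centers, mask)
        while True:
            new_centers = []
            for j in range(k):
                members = [p for p, lab in zip(data, labels) if lab == j]
                if members:
                    new_centers.append([sum(col) / len(members) for col in zip(*members)])
                else:
                    new_centers.append(centers[j])  # keep the center of an empty cluster
            new_labels, new_cost = assign(data, new_centers, mask)
            if new_cost >= cost - 1e-12:
                break
            centers, labels, cost = new_centers, new_labels, new_cost
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost

def repair(mask, rng):
    """Guarantee that at least one feature is selected."""
    if not any(mask):
        mask[rng.randrange(len(mask))] = 1
    return mask

def gen_grasp(data, k, seed=0, pop_size=6, generations=5, p_mut=0.1):
    """GA over binary feature masks; each mask is evaluated by running GRASP."""
    rng = random.Random(seed)
    n_feat = len(data[0])  # assumes at least two features
    pop = [repair([rng.randint(0, 1) for _ in range(n_feat)], rng) for _ in range(pop_size)]
    best = (float("inf"), None, None)  # (fitness, mask, labels)
    for _ in range(generations):
        scored = []
        for mask in pop:
            labels, cost = grasp_cluster(data, k, mask, rng, iters=3)
            # Assumed fitness: cost per selected feature (lower is better).
            scored.append((cost / sum(mask), mask, labels))
        scored.sort(key=lambda t: t[0])
        if scored[0][0] < best[0]:
            best = scored[0]
        parents = [m for _, m, _ in scored[: pop_size // 2]]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)  # one-point crossover
            child = [bit ^ (rng.random() < p_mut) for bit in a[:cut] + b[cut:]]
            children.append(repair(child, rng))
        pop = parents + children
    return best
```

Calling `gen_grasp(data, k)` on a list of equal-length numeric rows returns the best fitness found, the corresponding feature mask, and the cluster labels. The per-feature normalisation of the cost is only one way to compare masks of different sizes; with labelled benchmark data, a measure based on correctly clustered samples could be substituted.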


Author information

Corresponding author

Correspondence to Constantin Zopounidis.

About this article

Cite this article

Marinakis, Y., Marinaki, M., Doumpos, M. et al. A hybrid stochastic genetic–GRASP algorithm for clustering analysis. Oper Res Int J 8, 33–46 (2008). https://doi.org/10.1007/s12351-008-0004-8

  • DOI: https://doi.org/10.1007/s12351-008-0004-8
