Data clustering based on hybrid K-harmonic means and modifier imperialist competitive algorithm

The Journal of Supercomputing

Abstract

Clustering is the process of partitioning a dataset into groups. It is one of the most commonly used techniques in data mining and is very useful for finding optimal solutions. K-means is one of the simplest and most popular clustering methods and is based on the squared-error criterion. However, the algorithm depends on its initial state and is easily trapped in local optima. Some recent research shows that the K-means algorithm has been successfully applied to combinatorial optimization problems for clustering. K-harmonic means (KHM) clustering solves the initialization problem using a built-in boosting function, but it still suffers from convergence to local optima. In this article, we propose a novel method, named ICAKHM, that combines two algorithms: K-harmonic means and the modifier imperialist competitive algorithm. Four real datasets, Iris, Wine, Glass and Contraceptive Method Choice, with small, medium and large dimensions, are employed to evaluate the proposed method. The experimental results show that the new method (ICAKHM) achieves better results than the KHM, PSOKHM, GSOKHM and ICAKM methods.
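For orientation, the contrast between the K-means squared-error criterion and the K-harmonic means objective can be sketched in a few lines. This is only an illustration of the standard KHM objective, which replaces the minimum distance to a center with a harmonic average over all centers; it is not the ICAKHM method of this article. The function names, the default exponent `p = 2.0` and the guard value `eps` are assumptions made for the example.

```python
import math

def kmeans_sse(points, centers):
    """Squared-error criterion: each point contributes its squared
    distance to the nearest center only."""
    return sum(min(math.dist(x, c) ** 2 for c in centers) for x in points)

def khm_objective(points, centers, p=2.0, eps=1e-12):
    """K-harmonic means objective: each point contributes the harmonic
    average of its p-th power distances to ALL centers.
    p and eps are illustrative choices, not values from the article."""
    k = len(centers)
    total = 0.0
    for x in points:
        inv_sum = 0.0
        for c in centers:
            d = max(math.dist(x, c), eps)  # avoid division by zero
            inv_sum += 1.0 / d ** p
        total += k / inv_sum
    return total

if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0)]
    print(kmeans_sse(pts, [(1.0, 0.0)]))
    print(khm_objective(pts, [(1.0, 0.0)], p=2.0))
```

With a single center and `p = 2` the two objectives coincide. With several centers, every center influences every point's contribution in KHM, which is why it is less sensitive to the initial placement of centers than K-means.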

Acknowledgments

This research was supported by Mahshahr branch, Islamic Azad University, Iran. I would also like to thank the board and jury of Mahshahr University.

Author information

Corresponding author

Correspondence to Marjan Abdeyazdan.

About this article

Cite this article

Abdeyazdan, M. Data clustering based on hybrid K-harmonic means and modifier imperialist competitive algorithm. J Supercomput 68, 574–598 (2014). https://doi.org/10.1007/s11227-013-1053-1

  • DOI: https://doi.org/10.1007/s11227-013-1053-1
