Abstract
Clustering, the process of partitioning a dataset into groups, is one of the most commonly used techniques in data mining and is useful in finding optimum solutions. K-means is one of the simplest and most popular clustering methods and is based on the squared-error criterion. However, it depends on its initial state and is easily trapped in local optima. Recent research shows that the K-means algorithm has been successfully applied to combinatorial optimization problems for clustering. K-harmonic means (KHM) clustering addresses the initialization problem with a built-in boosting function, but it still tends to run into local optima. In this article, we propose a novel method, named ICAKHM, that combines two algorithms: K-harmonic means and a modified imperialist competitive algorithm. The proposed method is evaluated on four real datasets of small, medium and large dimensions: Iris, Wine, Glass and Contraceptive Method Choice. The experimental results show that ICAKHM produces better results than the KHM, PSOKHM, GSOKHM and ICAKM methods.
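The abstract does not give formulas, so the following is only a minimal sketch of plain K-harmonic means under the commonly cited formulation: the objective sums, over all points, the harmonic mean of the p-th powers of the distances to the k centers, and each center update uses soft memberships together with a per-point weight that grows for points far from every center (the "boosting" effect). The function names, the default exponent `p=3.5`, and the `eps` guard are assumptions made for illustration; the imperialist competitive part of ICAKHM is not shown.

```python
import math


def khm_objective(points, centers, p=3.5, eps=1e-12):
    """KHM objective: sum over points of k / sum_j d(x, c_j)^(-p)."""
    k = len(centers)
    total = 0.0
    for x in points:
        # eps guards against a zero distance when a center sits on a point
        inv_sum = sum(max(math.dist(x, c), eps) ** (-p) for c in centers)
        total += k / inv_sum
    return total


def khm_step(points, centers, p=3.5, eps=1e-12):
    """One KHM center update (soft membership times per-point weight)."""
    dim = len(centers[0])
    numer = [[0.0] * dim for _ in centers]
    denom = [0.0 for _ in centers]
    for x in points:
        dists = [max(math.dist(x, c), eps) for c in centers]
        a = [d ** (-p - 2) for d in dists]
        b = sum(d ** (-p) for d in dists)
        sum_a = sum(a)
        weight = sum_a / (b * b)  # large when x is far from all centers
        for j, a_j in enumerate(a):
            coef = (a_j / sum_a) * weight  # membership of x in cluster j, weighted
            denom[j] += coef
            for t in range(dim):
                numer[j][t] += coef * x[t]
    return [[numer[j][t] / denom[j] for t in range(dim)]
            for j in range(len(centers))]
```

Calling `khm_step` repeatedly until `khm_objective` stops improving gives a basic KHM loop. A hybrid such as the one described above would presumably use the objective as the cost function of the metaheuristic search, but that coupling is not specified in the abstract.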
Acknowledgments
This research was supported by Mahshahr branch, Islamic Azad University, Iran. I would also like to thank the board and jury of Mahshahr University.
Cite this article
Abdeyazdan, M. Data clustering based on hybrid K-harmonic means and modifier imperialist competitive algorithm. J Supercomput 68, 574–598 (2014). https://doi.org/10.1007/s11227-013-1053-1
DOI: https://doi.org/10.1007/s11227-013-1053-1