Abstract
Kernel K-means can handle nonlinearly separable datasets by mapping the input data into a high-dimensional feature space. The kernel matrix reflects the inner structure of the data, so constructing an appropriate kernel matrix is key. However, many kernel-based methods require the kernel parameter to be set manually in advance. Choosing an appropriate kernel parameter by hand for each dataset is difficult, which limits the performance of the kernel K-means algorithm to some extent. A method is therefore needed that adjusts the kernel parameter automatically according to the data structure. In addition, the number of clusters also has to be specified. To overcome these challenges, this paper proposes a self-adaptive kernel K-means algorithm based on the shuffled frog leaping algorithm, which regards the kernel parameter and the number of clusters as the position information of a frog. We design a clustering validity index for the kernel space, named KBWP, by modifying the Between-Within Proportion (BWP) clustering validity index. KBWP serves as the fitness in the shuffled frog leaping algorithm, and local and global optimization are then performed until the maximum number of iterations is reached. The kernel parameter and the number of clusters corresponding to the maximum fitness are taken as optimal. We experimentally verify our algorithm on artificial and real datasets. The experimental results demonstrate the effectiveness and good performance of the proposed algorithm.
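As a rough illustration of the two quantities the abstract says a frog position encodes, the sketch below runs a plain kernel K-means for one candidate pair (kernel parameter, number of clusters). It is a minimal sketch, not the authors' implementation. The Gaussian kernel, the function names `rbf_kernel_matrix` and `kernel_kmeans`, the random balanced initialization, and the toy data are all assumptions made for illustration. The KBWP fitness and the shuffled frog leaping search are not defined in the abstract and are deliberately left out.

```python
import math
import random


def rbf_kernel_matrix(points, sigma):
    """Gaussian kernel K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

    The Gaussian form is an assumption; the abstract only mentions
    "the kernel parameter" without naming the kernel.
    """
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return K


def kernel_kmeans(K, k, max_iter=100, seed=0):
    """Lloyd-style kernel K-means on a precomputed kernel matrix.

    The squared feature-space distance from point i to the mean of cluster C is
    K[i][i] - 2/|C| * sum_{j in C} K[i][j] + 1/|C|^2 * sum_{j,l in C} K[j][l].
    """
    n = len(K)
    rng = random.Random(seed)
    labels = [i % k for i in range(n)]  # balanced start, then shuffled
    rng.shuffle(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Third term of the distance: depends only on the cluster.
        within = [
            sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:  # skip clusters that have become empty
                    continue
                cross = sum(K[i][j] for j in m) / len(m)
                d = K[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels


# Hypothetical candidate position: (kernel parameter sigma, cluster count k).
sigma, k = 1.0, 2
blob = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
points = blob + [(x + 10.0, y + 10.0) for x, y in blob]
labels = kernel_kmeans(rbf_kernel_matrix(points, sigma), k)
```

In a search of the kind the abstract describes, each candidate pair would be scored by a validity index computed from the resulting partition, and the pair with the highest score would be kept.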
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant Nos. 61379101 and 61672522, the National Basic Research Program of China under Grant No. 2013CB329502, a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).
Ethics declarations
Conflict of interest
Shuyan Fan declares that she has no conflict of interest. Shifei Ding declares that he has no conflict of interest. Yu Xue declares that he has no conflict of interest.
Additional information
Communicated by V. Loia.
Cite this article
Fan, S., Ding, S. & Xue, Y. Self-adaptive kernel K-means algorithm based on the shuffled frog leaping algorithm. Soft Comput 22, 861–872 (2018). https://doi.org/10.1007/s00500-016-2389-2
DOI: https://doi.org/10.1007/s00500-016-2389-2