Abstract
Clustering analysis is a widely used data analysis technique that has been applied successfully in many fields, such as pattern recognition, machine learning, image processing, data mining, computer vision and fuzzy control. Its purpose is to partition data according to their intrinsic attributes so that data with the same characteristics fall in the same class and data that differ fall in different classes. The k-means clustering algorithm is currently one of the most commonly used clustering methods because it is simple and easy to implement. However, its performance depends heavily on the initial solution, and it is easily trapped in local optima during execution. To overcome these shortcomings, many scholars have applied meta-heuristic optimization algorithms to data clustering problems and have obtained satisfactory results. In this paper, a selfish herd optimization algorithm based on the simplex method (SMSHO) is proposed. In SMSHO, the simplex method replaces the mating operations to generate new prey individuals. Incorporating the simplex method increases the population diversity of the algorithm, thereby improving its global search ability. Twelve clustering datasets are selected to verify the performance of SMSHO on clustering problems. SMSHO is compared with ABC, BPFPA, DE, k-means, PSO, SMSSO and SHO. The experimental results show that SMSHO converges faster and achieves higher accuracy and stability than the other algorithms.
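To make the idea of clustering as an optimization problem concrete, the following minimal sketch shows one common way a meta-heuristic can score a candidate solution, together with a single simplex-style move. It is an illustration under stated assumptions, not the SMSHO procedure itself: the abstract does not give the objective function or the exact simplex operations, so the flat centroid encoding, the sum-of-distances objective, and the names `clustering_fitness` and `reflect_worst` are all hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def clustering_fitness(candidate: Sequence[float],
                       data: Sequence[Point], k: int) -> float:
    """Sum of Euclidean distances from each point to its nearest centroid.

    Assumption: a candidate is a flat vector holding k centroids of
    dimension d back to back (length k * d). Lower is better.
    """
    d = len(data[0])
    centroids = [candidate[i * d:(i + 1) * d] for i in range(k)]
    return sum(min(math.dist(p, c) for c in centroids) for p in data)


def reflect_worst(population: List[List[float]],
                  fitness: List[float], alpha: float = 1.0) -> List[float]:
    """One Nelder-Mead style reflection (assumed illustration only).

    The worst individual is reflected through the centroid of the
    remaining individuals; alpha is the reflection coefficient.
    """
    worst = max(range(len(population)), key=fitness.__getitem__)
    others = [x for i, x in enumerate(population) if i != worst]
    centre = [sum(col) / len(others) for col in zip(*others)]
    return [c + alpha * (c - w) for c, w in zip(centre, population[worst])]


# Two well-separated groups of points in the plane.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = [0.0, 0.5, 10.0, 10.5]   # one centroid per group
poor = [0.0, 0.5, 0.0, 0.6]     # both centroids in the same group
print(clustering_fitness(good, data, 2))
print(clustering_fitness(poor, data, 2))
```

The `poor` candidate mimics the kind of unlucky initial solution that k-means is sensitive to: both centroids start in the same group, so the objective is far larger than for `good`. A population-based search scores many such candidates at once, and a move such as `reflect_worst` is one way to generate a new candidate from the existing ones.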
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This work was supported by the National Natural Science Foundation of China (61941113, 82074580), the Fundamental Research Fund for the Central Universities (30918015103, 30918012204), the Science and Technology on Information System Engineering Laboratory (No. 05202004), the Nanjing Science and Technology Development Plan Project (201805036), the China Academy of Engineering Consulting Research Project (2019-ZD-1-02-02), the National Social Science Foundation (18BTQ073) and the State Grid Technology Project (5211XT190033). The authors gratefully acknowledge financial support from the China Scholarship Council (CSC No. 201906840057).
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest related to this work and no commercial or associative interest that represents a conflict of interest in connection with the work submitted.
About this article
Cite this article
Zhao, R., Wang, Y., Xiao, G. et al. A selfish herd optimization algorithm based on the simplex method for clustering analysis. J Supercomput 77, 8840–8910 (2021). https://doi.org/10.1007/s11227-020-03597-0
DOI: https://doi.org/10.1007/s11227-020-03597-0