Abstract
Cluster analysis is an essential tool in data mining. Many clustering algorithms have been proposed and implemented, and most of them produce good-quality clustering results. However, the majority of traditional clustering algorithms, such as K-means, K-medoids, and Chameleon, still depend on being given the number of clusters a priori and may struggle with problems where that number is unknown. The lack of this vital information can impose additional computational burdens or requirements on the clustering algorithm. In real-world clustering problems, the number of clusters in the data cannot easily be identified in advance, so determining the optimal number of clusters for a dataset of high density and dimensionality is a difficult task. Sophisticated automatic clustering techniques are therefore indispensable because of their flexibility and effectiveness. This paper presents a systematic taxonomical overview and bibliometric analysis of the trends and progress in nature-inspired metaheuristic clustering approaches, from the early attempts in the 1990s to today’s novel solutions. Finally, the paper covers key issues in formulating metaheuristic algorithms for clustering problems, along with major application areas.











Discover the latest articles, news and stories from top researchers in related subjects.References
Abbasi AA, Younis M (2007) A survey on clustering algorithms for wireless sensor networks. Comput Commun 30(14–15):2826–2841
Abdulwahab HA, Noraziah A, Alsewari AA, Salih SQ (2019) An enhanced version of black hole algorithm via levy flight for optimization and data clustering problems. IEEE Access 7:142085–142096
Abraham A, Das S, Konar A (2007) Kernel based automatic clustering using modified particle swarm optimization algorithm. In: Proceedings of the 9th annual conference on genetic and evolutionary computation. ACM, pp 2–9
Abualigah LM, Khader AT (2017) Unsupervised text feature selection technique based on hybrid particle swarm optimization algorithm with genetic operators for the text clustering. J Supercomput 73(11):4773–4795
Abualigah LM, Khader AT, Al-Betar MA (2016) Multi-objectives-based text clustering technique using K-mean algorithm. In: 2016 7th international conference on computer science and information technology (CSIT). IEEE, pp 1–6
Abualigah LM, Khader AT, Hanandeh ES (2018) A hybrid strategy for krill herd algorithm with harmony search algorithm to improve the data clustering. Intell Decis Technol 12(1):3–14
Abualigah LM, Khader AT, Al-Betar MA, Alomari OA (2017) Text feature selection with a robust weight scheme and dynamic dimension reduction to text document clustering. Expert Syst Appl 84:24–36
Abualigah LM, Khader AT, Al-Betar MA, Awadallah MA (2016) A krill herd algorithm for efficient text documents clustering. In: 2016 IEEE symposium on computer applications & industrial electronics (ISCAIE). IEEE, pp 67–72
Abualigah LM, Khader AT, AlBetar MA, Hanandeh ES (2016) A new hybridization strategy for krill herd algorithm and harmony search algorithm applied to improve the data clustering. In: 1st EAI international conference on computer science and engineering. European Alliance for Innovation (EAI), p 54
Abualigah LM, Khader AT, AlBetar MA, Hanandeh ES (2017) Unsupervised text feature selection technique based on particle swarm optimization algorithm for improving the text clustering. In: EAI international conference on computer science and engineering
Abubaker A, Baharum A, Alrefaei M (2015) Automatic clustering using multi-objective particle swarm and simulated annealing. PLoS One 10(7):e0130995
Agarwal P, Mehta S (2016) Enhanced flower pollination algorithm on data clustering. Int J Comput Appl 38(2–3):144–155
Agarwal P, Alam MA, Biswas R (2011) Issues, challenges and tools of clustering algorithms. arXiv preprint arXiv:1110.2610
Agbaje MB, Ezugwu AE, Els R (2019) Automatic data clustering using hybrid firefly particle swarm optimization algorithm. IEEE Access 7:184963–184984
Aggarwal CC (ed) (2014) Data classification: algorithms and applications. CRC Press, Boca Raton
Agusti LE, Salcedo-Sanz S, Jiménez-Fernández S, Carro-Calvo L, Del Ser J, Portilla-Figueras JA (2012) A new grouping genetic algorithm for clustering problems. Expert Syst Appl 39(10):9695–9703
Akinyelu AA, Ezugwu AE (2019) Nature inspired instance selection techniques for support vector machine speed optimization. IEEE Access 7:154581–154599
Akinyelu AA, Ezugwu AE, Adewumi AO (2019) Ant colony optimization edge selection for support vector machine speed optimization. Neural Comput Appl 32:1–33
Akyol S, Alatas B (2017) Plant intelligence based metaheuristic optimization algorithms. Artif Intell Rev 47(4):417–462
Alatas B (2011) ACROA: artificial chemical reaction optimization algorithm for global optimization. Expert Syst Appl 38(10):13170–13180
Aliniya Z, Mirroshandel SA (2019) A novel combinatorial merge-split approach for automatic clustering using imperialist competitive algorithm. Expert Syst Appl 117:243–266
Alswaitti M, Albughdadi M, Isa NAM (2018) Density-based particle swarm optimization algorithm for data clustering. Expert Syst Appl 91:170–186
Anand N, Vikram P (2015) Comprehensive analysis & performance comparison of clustering algorithms for big data. Rev Comput Eng Res 4:54–80
Anari B, Torkestani JA, Rahmani AM (2017) Automatic data clustering using continuous action-set learning automata and its application in segmentation of images. Appl Soft Comput 51:253–265
Arbelaitz O, Gurrutxaga I, Muguerza J, PéRez JM, Perona I (2013) An extensive comparative study of cluster validity indices. Pattern Recognit 46(1):243–256
Atabay HA, Sheikhzadeh MJ, Torshizi M (2016) A clustering algorithm based on integration of K-means and PSO. In: 2016 1st conference on swarm intelligence and evolutionary computation (CSIEC). IEEE, pp 59–63
Baker FB, Hubert LJ (1975) Measuring the power of hierarchical cluster analysis. J Am Stat Assoc 70(1975):31–38
Banati H, Bajaj M (2013) Performance analysis of firefly algorithm for data clustering. Int J Swarm Intell 1(1):19–35
Bandyopadhyay S, Saha S (2008) A point symmetry-based clustering technique for automatic evolution of clusters. IEEE Trans Knowl Data Eng 20(2008):1441–1457
Berkhin P (2006) A survey of clustering data mining techniques. In: Kogan J, Nicholas C, Teboulle M (eds) Grouping multidimensional data. Springer, Berlin, pp 25–71
Bezdek JC, Pal NR (1998) Some new indexes of cluster validity. IEEE Trans Syst Man Cyber Part B 28(3):301–315
Bezdek JC (2013) Pattern recognition with fuzzy objective function algorithms. Springer, Berlin
Blanco-Mesa F, León-Castro E, Merigó JM (2019) A bibliometric analysis of aggregation operators. Appl Soft Comput 81:105488
Blanco-Mesa F, Merigó JM, Gil-Lafuente AM (2017) Fuzzy decision making: a bibliometric-based review. J Intell Fuzzy Syst 32(3):2033–2050
Boryczka U (2009) Finding groups in data: cluster analysis with ants. Appl Soft Comput 9(1):61–70
Bouyer A, Ghafarzadeh H, Tarkhaneh O (2015) An efficient hybrid algorithm using cuckoo search and differential evolution for data clustering. Indian J Sci Technol 8(24):1–12
Calinski T, Harabasz J (1974) A dendrite method for cluster analysis. Commun Stat 3(1974):1–27
Chang DX, Zhang XD, Zheng CW, Zhang DM (2010) A robust dynamic niching genetic algorithm with niche migration for automatic clustering problem. Pattern Recognit 43(4):1346–1360
Chang H, Yeung DY (2008) Robust path-based spectral clustering. Pattern Recognit 41(1):191–203
Chou CH, Su MC, Lai E (2004) A new cluster validity measure and its application to image compression. Pattern Anal Appl 7(2):205–220
Chowdhury A, Bose S, Das S (2011) Automatic clustering based on invasive weed optimization algorithm. In: International conference on swarm, evolutionary, and memetic computing. Springer, Berlin, Heidelberg, pp 105–112
Chu Y, Mi H, Liao H, Ji Z, Wu QH (2008) A fast bacterial swarming algorithm for high-dimensional function optimization. In: 2008 IEEE congress on evolutionary computation (IEEE world congress on computational intelligence). IEEE, pp 3135–3140
Chuang LY, Hsiao CJ, Yang CH (2011) Chaotic particle swarm optimization for data clustering. Expert Syst Appl 38(12):14555–14563
Condorcet MJAN (2014) “Essai sur l“Application de l“Analyse `a la Probabilite´ des decisions rendues a la Pluralite´ des Voix,” paris: L“Imprimerie Royale, 1785
Corter, J. E. and Gluck, M. A. (1992). “Explaining basic categories: Feature predictability and information,” Psychological Bulletin, vol. 111, no. 2, pp 291–303, 1992
Cowgill MC, Harvey RJ, Watson LT (1999) A genetic algorithm approach to cluster analysis. Comput Math Appl 37(7):99–108
Cruz DPF, Maia RD, de Castro LN (2013) A new encoding scheme for a bee-inspired optimal data clustering algorithm. In: 2013 BRICS congress on computational intelligence and 11th Brazilian congress on computational intelligence. IEEE, pp 136–141
Cura T (2012) A particle swarm optimization approach to clustering. Expert Syst Appl 39(1):1582–1588
Dalrymple-Alford EC (1970) The measurement of clustering in free recall. Psycol. Bull. 74:32–34
Das S, Abraham A, Konar A (2007) Automatic clustering using an improved differential evolution algorithm. IEEE Trans Syst Man Cybern Part A Syst Hum 38(1):218–237
Das S, Abraham A, Konar A (2008) Automatic kernel clustering with a multi-elitist particle swarm optimization algorithm. Pattern Recognit Lett 29(5):688–699
Das S, Chowdhury A, Abraham A (2009) A bacterial evolutionary algorithm for automatic data clustering. In: 2009 IEEE congress on evolutionary computation. IEEE, pp 2403–2410
Das S, Mullick SS, Suganthan PN (2016) Recent advances in differential evolution—an updated survey. Swarm Evol Comput 27:1–30
Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Anal Mach Intell 2:224–227
Dorigo M, Birattari M (2010) Ant colony optimization. Springer, Berlin, pp 36–39
Dorigo M, Stützle T (2019) Ant colony optimization: overview and recent advances. In: Gendreau M, Potvin JY (eds) Handbook of metaheuristics. Springer, Cham, pp 311–351
Drewes B (2005) Some industrial applications of text mining. In: Knowledge mining. StudFuzz, vol 185. Springer, Berlin, Heidelberg, pp 233–238
Duan G, Hu W, Zhang Z (2016) A novel data clustering algorithm based on modified adaptive particle swarm optimization. Int J Signal Process Image Process Pattern Recognit 9(3):179–188
Duda RO, Hart PE, Stork DG (2001) Pattern classification. Wiley, New York
Dunn JC (1973) A Fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J Cyber 3(1973):32–57
Dutt A, Ismail MA, Herawan T (2017) A systematic review on educational data mining. IEEE Access 5:15991–16005
Dutta D, Dutta P, Sil J (2012) Data clustering with mixed features by multi objective genetic algorithm. In: 2012 12th international conference on hybrid intelligent systems (HIS). IEEE, pp 336–341
Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: MHS’95. Proceedings of the sixth international symposium on micro machine and human science. IEEE, pp 39–43
Elaziz MA, Nabil NEGGAZ, Ewees AA, Lu S (2019) Automatic data clustering based on hybrid atom search optimization and sine–cosine algorithm. In: 2019 IEEE congress on evolutionary computation (CEC). IEEE, pp 2315–2322
Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD, vol 96, No. 34, pp 226–231
Ezenkwu CP, Ozuomba S, Kalu C (2015) Application of K-means algorithm for efficient customer segmentation: a strategy for targeted customer services. Int J Adv Res Artif Intell (IJARAI) 4(10):40–44
Ezugwu AE (2019) Enhanced symbiotic organisms search algorithm for unrelated parallel machines manufacturing scheduling with setup times. Knowl-Based Syst 172:15–32
Ezugwu AES, Adewumi AO (2017) Discrete symbiotic organisms search algorithm for travelling salesman problem. Expert Syst Appl 87:70–78
Ezugwu AE, Adewumi AO (2017) Soft sets based symbiotic organisms search algorithm for resource discovery in cloud computing environment. Future Gener Comput Syst 76:33–50
Ezugwu AE, Akutsah F (2018) An improved firefly algorithm for the unrelated parallel machines scheduling problem with sequence-dependent setup times. IEEE Access 6:54459–54478
Ezugwu AE, Prayogo D (2019) Symbiotic organisms search algorithm: theory, recent advances and applications. Expert Syst Appl 119:184–209
Ezugwu AE, Adeleke OJ, Viriri S (2018) Symbiotic organisms search algorithm for the unrelated parallel machines scheduling with sequence-dependent setup times. PLoS One 13(7):e0200030
Ezugwu AE, Adeleke OJ, Akinyelu AA, Viriri S (2019) A conceptual comparison of several metaheuristic algorithms on continuous optimisation problems. Neural Comput Appl 32:1–45
Ezugwu AE, Akutsah F, Olusanya MO, Adewumi AO (2018) Enhanced intelligent water drops algorithm for multi-depot vehicle routing problem. PLoS One 13(3):e0193751
Ezugwu AE (2020) Nature-inspired metaheuristic techniques for automatic clustering: a survey and performance study. SN Appl Sci 2(2):273
Fahad A, Alshatri N, Tari Z, Alamri A, Khalil I, Zomaya AY, Foufou S, Bouras A (2014) A survey of clustering algorithms for big data: taxonomy and empirical analysis. IEEE Trans Emerg Top Comput 2(3):267–279
Falkenauer E (1998) Genetic algorithms and grouping problems. Wiley, Chichester
Fisher DH (1987) Knowledge acquisition via incremental conceptual clustering. Mach Learn 2(2):139–172. https://doi.org/10.1007/BF00114265
Fister I, Fister I Jr, Yang XS, Brest J (2013) A comprehensive review of firefly algorithms. Swarm Evol Comput 13:34–46
Fortier JJ, Solomon H (1966) Clustering procedures. In: Krishnaiah PR (ed) Multivariate analysis, vol 62. Academic Press, New York
Fowlkes EB, Mallows CL (1983) A method for comparing two hierarchical clusterings. J Am Stat Assoc 78(383):553–569
Gan G, Ma C, Wu J (2007) Data clustering: theory, algorithms, and applications, vol 20. SIAM, Philadelphia
Gluck MA, Corter JE (1985) Information, uncertainty, and the utility of categories. In: Program of the 7th annual conference of the cognitive science society, pp 283–287
Goel S, Sharma A, Bedi P (2011) Cuckoo search clustering algorithm: a novel strategy of biomimicry. In: 2011 world congress on information and communication technologies. IEEE, pp 916–921
Goldberg DE, Holland JH (1988) Genetic algorithms and machine learning. Mach Learn 3(2):95–99
Guha S, Rastogi R, Shim K (2001) Cure: an efficient clustering algorithm for large databases. Inf Syst 26(1):35–58
Guo D, Chen J, Chen Y, Li Z (2018) LBIRCH: an improved BIRCH algorithm based on link. ICMLC 2018:74–79
Halkidi M, Vazirgiannis M (2001) Clustering validity assessment: finding the optimal partitioning of a data set. In: Proceedings 2001 IEEE international conference on data mining. IEEE, pp 187–194
Halkidi M, Batistakis Y, Vazirgiannis M (2002) Clustering validity checking methods: part II. ACM Sigmod Rec 31(3):19–27
Halkidi M, Vazirgiannis M, Batistakis I (2000) Quality scheme assessment in the clustering process. In: Proceedings of PKDD, Lyon, France
Hameed PN, Verspoor K, Kusljic S, Halgamuge S (2018) A two-tiered unsupervised clustering approach for drug repositioning through heterogeneous data integration. BMC Bioinform 19:29
Hamerly G, Elkan C (2004) Learning the k in k-means. In: Advances in neural information processing systems. MIT Cambridge Press, 2003, pp 281–288
Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton, p 159
Hartigan JA, Wong MA (1979) Algorithm AS 136: A k-means clustering algorithm. J R Stat Soc Ser C (Appl Stat) 28(1):100–108
Hassanzadeh T, Meybodi MR (2012) A new hybrid approach for data clustering using firefly algorithm and K-means. In: The 16th CSI international symposium on artificial intelligence and signal processing (AISP 2012). IEEE, pp 007–011
Hatamlou A (2013) Black hole: a new heuristic optimization approach for data clustering. Inf Sci 222:175–184
He H, Tan Y (2012) A two-stage genetic algorithm for automatic clustering. Neurocomputing 81:49–59
He Q, Jin X, Du C, Zhuang F, Shi Z (2014) Clustering in extreme learning machine feature space. Neurocomputing 128:88–95
Hoos HH, Stützle T (2004) Stochastic local search: foundations and applications. Elsevier, Amsterdam
Hosseinimotlagh S, Papalexakis EE (2018) Unsupervised content-based identification of fake news articles with tensor decomposition ensembles. In: Proceedings of the workshop on misinformation and misbehavior mining on the web (MIS2)
Hruschka ER, Campello RJ, Freitas AA (2009) A survey of evolutionary algorithms for clustering. IEEE Trans Syst Man Cybern Part C (Appl Rev) 39(2):133–155
Huang CL, Huang WC, Chang HY, Yeh YC, Tsai CY (2013) Hybridization strategies for continuous ant colony optimization and particle swarm optimization applied to data clustering. Appl Soft Comput 13(9):3864–3872
Hussain K, Salleh MNM, Cheng S, Shi Y (2019) Metaheuristic research: a comprehensive survey. Artif Intell Rev 52(4):2191–2233
Jaccard P (1901) Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines. Bulletin de la Société Vaudoise des Sciences Naturelles 37:241–272
Jafar OM, Sivakumar R (2010) Ant-based clustering algorithms: a brief survey. Int J Comput Theory Eng 2(5):787
Jain AK (2010) Data clustering: 50 years beyond K-means. Pattern Recognit Lett 31(8):651–666
Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv (CSUR) 31(3):264–323
Jakobsson M, Rosenberg NA (2007) CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23(14):1801–1806
Janmaijaya M, Shukla AK, Abraham A, Muhuri PK (2018) A scientometric study of neurocomputing publications (1992–2018): an aerial overview of intrinsic structure. Publications 6(3):32
Jensi R, Jiji GW (2015) MBA-LF: a new data clustering method using modified BAT algorithm and levy flight. ICTACT J Soft Comput 6(1):1093–1101
José-García A, Gómez-Flores W (2016) Automatic clustering using nature-inspired metaheuristics: a survey. Appl Soft Comput 41:192–213
Kansal T, Bahuguna S, Singh V, Choudhury T (2018) Customer segmentation using K-means clustering. In: 2018 international conference on computational techniques, electronics and mechanical systems (CTEMS)
Kao Y, Chen CC (2014) Automatic clustering for generalised cell formation using a hybrid particle swarm optimisation. Int J Prod Res 52(12):3466–3484
Kapoor S, Zeya I, Singhal C, Nanda SJ (2017) A grey wolf optimizer based automatic clustering algorithm for satellite image segmentation. Proc Comput Sci 115:415–422
Karaboga D (2005) An idea based on honey bee swarm for numerical optimization, vol 200. Technical report-tr06. Erciyes Universityersity, Engineering Faculty, Computer Engineering Department, pp 1–10
Karaboğa D, Ökdem S (2004) A simple and global optimization algorithm for engineering problems: differential evolution algorithm. Turk J Electr Eng Comput Sci 12(1):53–60
Karaboga D, Ozturk C (2011) A novel clustering approach: artificial bee colony (ABC) algorithm. Appl Soft Comput 11(1):652–657
Karthikeyan M, Aruna P (2013) Probability based document clustering and image clustering using content-based image retrieval. Appl Soft Comput 13(2):959–966
Karypis G, Han EH, Chameleon VK (1999) A hierarchical clustering algorithm using dynamic modeling. IEEE Comput 32(8):68–75
Kaushik K, Arora V (2015) A hybrid data clustering using firefly algorithm based improved genetic algorithm. Proc Comput Sci 58:249–256
Kosters WA, Laros JF (2007) Metrics for mining multisets. In: International conference on innovative techniques and applications of artificial intelligence. Springer, London, pp 293–303
Kotsiantis S, Pintelas EP (2004) Recent advances in clustering: a brief survey. WSEAS Trans Inf Sci Appl 1:73–81
Kovács F, Ivancsy R (2006) Cluster validity measurement for arbitrary shaped clustering. In: Proceeding of the 5th. WSEAS international conference on artificial, knowledge engineering and data bases, Madrid, Spain, February 15–17, 2006, pp 372–377
Kumar V, Chhabra JK, Kumar D (2014) Automatic cluster evolution using gravitational search algorithm and its application on image segmentation. Eng Appl Artif Intell 29:93–103
Kumar V, Chhabra JK, Kumar D (2016) Automatic data clustering using parameter adaptive harmony search algorithm and its application to image segmentation. J Intell Syst 25(4):595–610
Kumar Y, Sahoo G (2014) A review on gravitational search algorithm and its applications to data clustering & classification. Int J Intell Syst Appl 6(6):79
Kundu D, Suresh K, Ghosh S, Das S, Abraham A, Badr Y (2009) Automatic clustering using a synergy of genetic algorithm and multi-objective differential evolution. In: International conference on hybrid artificial intelligence systems. Springer, Berlin, Heidelberg, pp 177–186
Kuo RJ, Zulvia FE (2018) Automatic clustering using an improved artificial bee colony optimization for customer segmentation. Knowl Inf Syst 57(2):331–357
Kuo RJ, Huang YD, Lin CC, Wu YH, Zulvia FE (2014) Automatic kernel clustering with bee colony optimization algorithm. Inf Sci 283:107–122
Kuo RJ, Syu YJ, Chen ZY, Tien FC (2012) Integration of particle swarm optimization and genetic algorithm for dynamic clustering. Inf Sci 195:124–140
Kuo R, Zulvia F (2013) Automatic clustering using an improved particle swarm optimization. J Ind Intell Inf 1(1):46–51
Kużelewska U (2014) Clustering algorithms in hybrid recommender system on MovieLens data. Stud Log Gramm Rhetor 37(50):125–139
Lago-Fernández LF, Corbacho F (2010) Normality-based validation for crisp clustering. Pattern Recognit 43(3):782–795
Landers JR, Duperrouzel B (2018) Machine learning approaches to competing in fantasy leagues for the NFL. IEEE Trans Games 11(2):159–172
Lashkari M, Moattar MH (2015) The improved K-means clustering algorithm using the proposed extended PSO algorithm. In: 2015 international congress on technology, communication and knowledge (ICTCK). IEEE, pp 429–434
Lee WP, Chen SW (2010) Automatic clustering with differential evolution using cluster number oscillation method. In: 2010 2nd international workshop on intelligent systems and applications. IEEE, pp 1–4
Legány C, Juhász S, Babos A (2006) Cluster validity measurement techniques. In: Proceeding of the 5th. WSEAS international conference on artificial, knowledge engineering and data bases, Madrid, Spain, February 15–17, 2006, pp 388–393
Ling HL, Wu JS, Zhou Y, Zheng WS (2016) How many clusters? A robust PSO-based local density model. Neurocomputing 207:264–275
Liu R, Wang X, Li Y, Zhang X (2012) Multi-objective invasive weed optimization algortihm for clustering. In: 2012 IEEE congress on evolutionary computation. IEEE, pp 1–8
Liu R, Zhu B, Bian R, Ma Y, Jiao L (2015) Dynamic local search based immune automatic clustering algorithm and its applications. Appl Soft Comput 27:250–268
Liu T, Rosenberg C, Rowley HA (2007) Clustering billions of images with large scale nearest neighbor search. In: 2007 IEEE workshop on applications of computer vision (WACV’07). IEEE, p 28
Liu X, Fu H (2010) An effective clustering algorithm with ant colony. JCP 5(4):598–605
Liu Y, Wu X, Shen Y (2011) Automatic clustering using genetic algorithms. Appl Math Comput 218(4):1267–1279
MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, No. 14, pp 281–297
Majhi SK, Biswal S (2018) Optimal cluster analysis using hybrid K-means and ant lion optimizer. Karbala Int J Mod Sci 4(4):347–360
Malhat MG, Mousa HM, El-Sisi AB (2014) Clustering of chemical data sets for drug discovery. In: 2014 9th international conference on informatics and systems
Marinakis Y, Marinaki M, Matsatsinis N (2009) A hybrid discrete artificial bee colony-GRASP algorithm for clustering. In: 2009 international conference on computers & industrial engineering. IEEE, pp 548–553
Masoud H, Jalili S, Hasheminejad SMH (2013) Dynamic clustering using combinatorial particle swarm optimization. Appl Intell 38(3):289–314
Maulik U, Saha I (2009) Modified differential evolution based fuzzy clustering for pixel classification in remote sensing imagery. Pattern Recognit 42(9):2135–2149
Maulik U, Saha I (2010) Automatic fuzzy clustering using modified differential evolution for image classification. IEEE Trans Geosci Remote Sens 48(9):3503–3510
Mehrabian AR, Lucas C (2006) A novel numerical optimization algorithm inspired from weed colonization. Ecol Inform 1(4):355–366
Merigó JM, Cobo MJ, Laengle S, Rivas D, Herrera-Viedma E (2019) Twenty years of soft computing: a bibliometric overview. Soft Comput 23(5):1477–1497
Milligan GW, Cooper MC (1987) Methodology review: clustering methods. Appl Psychol Meas 11(4):329–354
Mohammadpour T, Bidgoli AM, Enayatifar R, Javadi HH (2019) Efficient clustering in collaborative filtering recommender system: hybrid method based on genetic algorithm and gravitational emulation local search algorithm. Genomics 111(6):1902–1912
Molina D, Poyatos J, Del Ser J, García S, Hussain A, Herrera F (2020) Comprehensive taxonomies of nature-and bio-inspired optimization: inspiration versus algorithmic behavior, critical analysis and recommendations. arXiv preprint arXiv:2002.08136
Mouton JP, Ferreira M, Helberg SJA (2020) A comparison of clustering algorithms for automatic modulation classification. Expert Syst Appl 151:113317
Muhuri PK, Shukla AK, Abraham A (2019) Industry 4.0: a bibliometric analysis and detailed overview. Eng Appl Artif Intell 78:218–235
Muhuri PK, Shukla AK, Janmaijaya M, Basu A (2018) Applied soft computing: a bibliometric analysis of the publications and citations during (2004–2016). Appl Soft Comput 69:381–392
Murty MR, Naik A, Murthy JVR, Reddy PP, Satapathy SC, Parvathi K (2014) Automatic clustering using teaching learning based optimization. Appl Math 5(08):1202
Nanda SJ, Panda G (2013) Automatic clustering algorithm based on multi-objective immunized PSO to classify actions of 3D human models. Eng Appl Artif Intell 26(5–6):1429–1441
Nayak J, Kanungo DP, Naik B, Behera HS (2016) Evolutionary improved swarm-based hybrid K-means algorithm for cluster analysis. In: Proceedings of the second international conference on computer and communication technologies. Springer, New Delhi, pp 343–352
Nayak J, Nanda M, Nayak K, Naik B, Behera HS (2014) An improved firefly fuzzy c-means (FAFCM) algorithm for clustering real world data sets. In: Advanced computing, networking and informatics, vol 1. Springer, Cham, pp 339–348
Nayyar A, Puri V (2017) Comprehensive analysis & performance comparison of clustering algorithms for big data. Rev Comput Eng Res 4(2):54–80
Nerurkar P, Shirke A, Chandane M, Bhirud S (2018) Empirical analysis of data clustering algorithms. Proc Comput Sci 125:770–779
Niknam T, Olamaie J, Amiri B (2008) A hybrid evolutionary algorithm based on ACO and SA for cluster analysis. J Appl Sci 8(15):2695–2702
Niu B, Wang H (2012) Bacterial colony optimization. Discrete Dyn Nat Soc 2012:698057. https://doi.org/10.1155/2012/698057
Omran MG, Salman A, Engelbrecht AP (2006) Dynamic clustering using particle swarm optimization with application in image segmentation. Pattern Anal Appl 8(4):332
Ozturk C, Hancer E, Karaboga D (2015) Dynamic clustering with improved binary artificial bee colony algorithm. Appl Soft Comput 28:69–80
Pacheco TM, Gonçalves LB, Ströele V, Soares SSR (2018) An ant colony optimization for automatic data clustering problem. In: 2018 IEEE congress on evolutionary computation (CEC). IEEE, pp 1–8
Pal NR, Biswas J (1997) Cluster validation using graph theoretic concepts. Pattern Recognit 30(6):847–857
Paterlini S, Krink T (2006) Differential evolution and particle swarm optimisation in partitional clustering. Comput Stat Data Anal 50(5):1220–1247
Pelleg D (2000) Extending K-means with efficient estimation of the number of clusters in ICML. In: Proceedings of the 17th international conference on machine learning, pp 277–281
Peng H, Wang J, Shi P, Riscos-Núñez A, Pérez-Jiménez MJ (2015) An automatic clustering algorithm inspired by membrane computing. Pattern Recognit Lett 68:34–40
Raftery A (1986) A note on Bayes factors for log-linear contingency table models with vague prior information. J R Stat Soc 48(2):249–250
Rahman MA, Islam MZ (2014) A hybrid clustering technique combining a novel genetic algorithm with K-means. Knowl-Based Syst 71:345–365
Rajah V, Ezugwu AE (2020) Hybrid symbiotic organism search algorithms for automatic data clustering. In: 2020 conference on information communications technology and society (ICTAS). IEEE, pp 1–9
Rajpurohit J, Sharma TK, Abraham A, Vaishali A (2017) Glossary of metaheuristic algorithms. Int J Comput Inf Syst Ind Manag Appl 9:181–205
Ramadas M, Abraham A (2019) Metaheuristics for data clustering and image segmentation. Springer, Berlin
Rana S, Jasola S, Kumar R (2010) A hybrid sequential approach for data clustering using K-Means and particle swarm optimization algorithm. Int J Eng Sci Technol 2(6)
Rana S, Jasola S, Kumar R (2013) A boundary restricted adaptive particle swarm optimization for data clustering. Int J Mach Learn Cybern 4(4):391–400
Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850
Rani Y, Rohil H (2013) A study of hierarchical clustering algorithm. Int J Inf Comput Technol 3(10):1115–1122
Rao RV, Savsani VJ, Vakharia DP (2011) Teaching–learning-based optimization: a novel method for constrained mechanical design optimization problems. Comput Aided Des 43(3):303–315
Raposo C, Antunes CH, Barreto JP (2014) Automatic clustering using a genetic algorithm with new solution encoding and operators. In: International conference on computational science and its applications. Springer, Cham, pp 92–103
Razmjooy N, Khalilpour M, Ramezani M (2016) A new meta-heuristic optimization algorithm inspired by FIFA world cup competitions: theory and its application in PID designing for AVR system. J Control Autom Electr Syst 27(4):419–440
Rendon LE, Garcia R, Abundez I, Gutierrez C et al (2002) Niva: a robust cluster validity. In: 2th. WSEAS international conference on scientific computation and soft computing, Crete, Greece, pp 209–213
Rijsbergen V (1979) Information retrieval. Butterworths, London, p 1979
Rokach L (2005) “Clustering methods”, data mining and knowledge discovery handbook. Springer, Berlin, pp 331–352
Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65
Sabau AS (2012) Survey of clustering based financial fraud detection research. Inform Econ 16(1):110
Saemi B, Hosseinabadi AA, Kardgar M, Balas VE (2018) Nature inspired partitioning clustering algorithms: a review and analysis. In: Advances in intelligent systems and computing, pp 96–116
Saha I, Maulik U, Bandyopadhyay S (2009) A new differential evolution based fuzzy clustering for automatic cluster evolution. In: 2009 IEEE international advance computing conference. IEEE, pp 706–711
Sahoo AJ, Kumar Y (2014) Modified teacher learning based optimization method for data clustering. In: Advances in signal processing and intelligent recognition systems. Springer, Cham, pp 429–437
Saitta S, Raphael B, Smith I (2007) Abounded index for cluster validity. In: Perner P (ed) Machine learning and data mining in pattern recognition, vol 4571. Lecture notes in computer science. Springer, Berlin, pp 174–187
Salcedo-Sanz S, Carro-Calvo L, Portilla-Figueras A, Cuadra L, Camacho D (2013) Fuzzy clustering with grouping genetic algorithms. In: International conference on intelligent data engineering and automated learning. Springer, Berlin, Heidelberg, pp 334–341
Sarsoh JT, Hashim KM, Miften FS (2009) Comparisons between automatic and non-automatic clustering algorithms. J Coll Educ Pure Sci 4(1):221–227
Satapathy SC, Naik A (2011) Data clustering based on teaching-learning-based optimization. In: International conference on swarm, evolutionary, and memetic computing. Springer, Berlin, Heidelberg, pp 148–156
Sathappan S, Sridhar S, Tomar DC (2017) A literature study on traditional clustering algorithms for uncertain data. J Adv Math Comput Sci 21:1–21
Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A et al (2017) A review of clustering techniques and developments. Neurocomputing 267:664–681
Senthilnath J, Das V, Omkar SN, Mani V (2013) Clustering using levy flight cuckoo search. In: Proceedings of seventh international conference on bio-inspired computing: theories and applications (BIC-TA 2012). Springer, India, pp 65–75
Senthilnath J, Omkar SN, Mani V (2011) Clustering using firefly algorithm: performance study. Swarm Evol Comput 1(3):164–171
Sharma M, Chhabra JK (2019) Sustainable automatic data clustering using hybrid PSO algorithm with mutation. Sustain Comput Inform Syst 23:144–157
Sharma SC (1996) Applied multivariate techniques. Wiley, New York
Shehab M, Khader AT, Al-Betar MA (2017) A survey on applications and variants of the cuckoo search algorithm. Appl Soft Comput 61:1041–1059
Shirkhorshidi AS, Aghabozorgi S, Wah TY, Herawan T (2014) Big data clustering: a review. Lecture notes in computer science. Springer, Cham, pp 707–720
Shukla AK, Banshal SK, Seth T, Basu A, John R, Muhuri PK (2020) A bibliometric overview of the field of type-2 fuzzy sets and systems [discussion forum]. IEEE Comput Intell Mag 15(1):89–98
Shukla AK, Sharma R, Muhuri PK (2018) A review of the scopes and challenges of the modern real-time operating systems. Int J Embed Real-Time Commun Syst (IJERTCS) 9(1):66–82
Shukla N, Merigó JM, Lammers T, Miranda L (2020) Half a century of computer methods and programs in biomedicine: a bibliometric analysis from 1970 to 2017. Comput Methods Programs Biomed 183:105075
Silva Filho TM, Pimentel BA, Souza RM, Oliveira AL (2015) Hybrid methods for fuzzy clustering based on fuzzy c-means and improved particle swarm optimization. Expert Syst Appl 42(17–18):6315–6328
Singh J, Kumar R, Mishra AK (2015) Clustering algorithms for wireless sensor networks: a review. In: 2015 2nd international conference on computing for sustainable global development (INDIACom)
Storn R, Price K (1997) Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. J Glob Optim 11(4):341–359
Strehl A, Ghosh J (2000) Clustering guidance and quality evaluation using relationship-based visualization. In: Intelligent engineering systems through artificial neural networks, St. Louis, Missouri, USA, pp 483–488
Sundararajan S, Karthikeyan S (2014) An efficient hybrid approach for data clustering using dynamic K-means algorithm and firefly algorithm. J Eng Appl Sci 9(8):1348–1353
Suresh K, Kundu D, Ghosh S, Das S, Abraham A (2009) Automatic clustering with multi-objective differential evolution algorithms. In 2009 IEEE congress on evolutionary computation. IEEE, pp 2590–2597
Taghva K, Sharma M (2007) Comparison of automatic clustering and manual categorization of documents. In: Akhgar B (ed) ICCS
Tan PN, Steinbach M, Kumar V (2013) Data mining cluster analysis: basic concepts and algorithms. In: Introduction to data mining, pp 487–533
Tang WH, Wu QH (2011) Evolutionary computation. In: Tang WH, Wu QH (eds) Condition monitoring and assessment of power transformers using computational intelligence. Springer, London, pp 15–36
Theodoridis S, Koutroubas K (1999) Pattern recognition. Academic Press, Cambridge
Thomas MC, Romagnoli J (2016) Extracting knowledge from historical databases for process monitoring using feature extraction and data clustering. In: Proceedings of the 26th European symposium on computer aided process engineering—ESCAPE vol 26, pp 861–864
Tran DC, Wu Z, Wang Z, Deng C (2015) A novel hybrid data clustering algorithm based on artificial bee colony algorithm and k-means. Chin J Electron 24(4):694–701
Tsai CW, Huang KW, Yang CS, Chiang MC (2015) A fast particle swarm optimization for clustering. Soft Comput 19(2):321–338
Tsay RS (2005) Analysis of financial time series. Wiley, New York
Tseng LY, Yang SB (2001) A genetic approach to the automatic clustering problem. Pattern Recognit 34(2):415–424
Van der Merwe DW, Engelbrecht AP (2003) Data clustering using particle swarm optimization. In: The 2003 congress on evolutionary computation, 2003. CEC’03, vol 1. IEEE, pp 215–220
Van Eck NJ, Waltman L (2010) Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 84(2):523–538
Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Vo-Van T, Nguyen-Hai A, Tat-Hong MV, Nguyen-Trang T (2020) A new clustering algorithm and its application in assessing the quality of underground water. Sci Program 2020:6458576. https://doi.org/10.1155/2020/6458576
Wang R, Zhou Y, Qiao S, Huang K (2016) Flower pollination algorithm with bee pollinator for cluster analysis. Inf Process Lett 116(1):1–14
Wang S, Wu Y (2010) Clustering analysis based on chaos genetic algorithm. In: 2010 Chinese control and decision conference. IEEE, pp 16–19
Wei G, Liu H, Xie M (2009) Clustering large spatial data with local-density and its application. Inf Technol J 8(4):476–485
Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Philip SY, Zhou ZH (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37
Xu D, Tian Y (2015) A comprehensive survey of clustering algorithms. Ann Data Sci 2(2):165–193
Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678
Yang XS (2010) Firefly algorithm, nature inspired metaheuristic algorithms, 2010. LUniversityer Press, Frome
Younsi R, Wang W (2004) A new artificial immune system algorithm for clustering. In: International conference on intelligent data engineering and automated learning. Springer, Berlin, Heidelberg, pp 58–64
Yu D, Xu Z, Kao Y, Lin CT (2017) The structure and citation landscape of IEEE Transactions on Fuzzy Systems (1994–2015). IEEE Trans Fuzzy Syst 26(2):430–442
Yu JY, Chong PHJ (2005) A survey of clustering schemes for mobile ad hoc networks. IEEE Commun Surv Tutor 7(1):32–48
Žalik KR (2008) An efficient k′-means clustering algorithm. Pattern Recognit Lett 29(9):1385–1391
Žalik KR, Žalik B (2011) Validity index for clusters of different sizes and densities. Pattern Recognit Lett 32(2):221–234
Zanjireh MM, Shahrabi A, Larijani H (2013) ANCH: a new clustering algorithm for wireless sensor networks
Zhao M, Tang H, Guo J, Sun Y (2014) Data clustering using particle swarm optimization. In: Future information technology. Springer, Berlin, Heidelberg, pp 607–612
Zhao XQ, Zhou JH (2015) Improved kernel possibilistic fuzzy clustering algorithm based on invasive weed optimization. J Shanghai Jiaotong Univ (Sci) 20(2):164–170
Zhong Y, Zhang S, Zhang L (2013) Automatic fuzzy clustering based on adaptive multi-objective differential evolution for remote sensing imagery. IEEE J Sel Top Appl Earth Obs Remote Sens 6(5):2290–2301
Zhou Y, Wu H, Luo Q, Abdel-Baset M (2018) Automatic data clustering using nature-inspired symbiotic organism search algorithm. Knowl-Based Syst 163:546–557
Zhou Y, Wu H, Luo Q, Abdel-Baset M (2019) Automatic data clustering using nature-inspired symbiotic organism search algorithm. Knowl-Based Syst 163:546–557
Zou F, Chen D, Xu Q (2019) A survey of teaching–learning-based optimization. Neurocomputing 335:366–383
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
About this article
Cite this article
Ezugwu, A.E., Shukla, A.K., Agbaje, M.B. et al. Automatic clustering algorithms: a systematic review and bibliometric analysis of relevant literature. Neural Comput & Applic 33, 6247–6306 (2021). https://doi.org/10.1007/s00521-020-05395-4
DOI: https://doi.org/10.1007/s00521-020-05395-4