I/F-Race tuned firefly algorithm and particle swarm optimization for K-medoids-based clustering

  • Research Paper
Evolutionary Intelligence

Abstract

Clustering is still one of the most common unsupervised learning techniques in data mining since it allows the discovery of meaningful and interesting patterns, knowledge, rules and associations from large-scale datasets. K-medoids, a variant of K-means, is a popular clustering method that attempts to find the optimal combination of K medoids from among a set of potential combinations. It has been successfully applied to solve various real-life problems owing to its simplicity and effectiveness. Nevertheless, due to the exponential number of possible combinations of K medoids, it is extremely challenging to produce the optimal one within a reasonable amount of time. Therefore, in this work, we propose to formulate the problem of K-medoids clustering as an optimization problem and then combine two effective and powerful Swarm Intelligence (SI) algorithms, namely Firefly Algorithm (FA) and Particle Swarm Optimization (PSO), to select the appropriate combination of K medoids. We extensively evaluate the proposed FA-PSO for K-medoids-based clustering, abbreviated as FA-PSO-KMED, using 10 UCI datasets. We first use the Iterated F-Race (I/F-Race) algorithm to determine the optimal parameter settings for FA and PSO. Then, we compare the results of the proposed FA-PSO-KMED with those obtained using the well-known state-of-the-art K-medoids-based clustering algorithms: PAM, CLARA and CLARANS. We also compare the results with 11 popular swarm intelligence algorithms: PSO, ABC, CS, FA, BA, APSO, EHO, HHO, SMA, AO and RSA. Experimental results and statistical analysis show that the proposed FA-PSO-KMED is very promising and demonstrates a significant improvement over the other clustering algorithms.
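
To make the optimization view of K-medoids concrete, the following is a minimal sketch of the objective that a search method would minimize, assuming Euclidean distance and a sum-of-distances cost (the abstract does not state the exact fitness function used by FA-PSO-KMED). The function names `kmedoids_cost` and `brute_force_medoids` and the toy data are hypothetical and are not taken from the paper. The exhaustive search is included only to show why enumeration stops being practical: the number of candidate medoid sets is the binomial coefficient C(n, K).

```python
from itertools import combinations
from math import comb

import numpy as np


def kmedoids_cost(X, medoid_idx):
    """Sum of distances from every point to its nearest medoid (assumed objective)."""
    medoids = X[list(medoid_idx)]
    # Pairwise Euclidean distances, shape (n_points, K)
    dists = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
    return dists.min(axis=1).sum()


def brute_force_medoids(X, k):
    """Try every K-subset of point indices; only feasible for tiny inputs."""
    return min(combinations(range(len(X)), k), key=lambda idx: kmedoids_cost(X, idx))


# Toy data (hypothetical): two well-separated groups of three points on a line
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
best = brute_force_medoids(X, 2)
best_cost = kmedoids_cost(X, best)

# Size of the search space for the toy data and for n = 150, K = 3
n_candidates_small = comb(len(X), 2)
n_candidates_large = comb(150, 3)
```

On the toy data the search visits 15 candidate sets, whereas a dataset of 150 points with K = 3 already yields 551,300 candidates. A swarm-based method would evaluate `kmedoids_cost` only on a sampled subset of these candidates instead of enumerating all of them.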

Acknowledgements

We gratefully acknowledge the Directorate General for Scientific Research and Technological Development (DGRSDT) for supporting this work under grant number C0662300.

Funding

This work was supported by the Directorate General for Scientific Research and Technological Development (DGRSDT) under grant number C0662300.

Author information

Contributions

Supervision: Habiba Drias. Concept and design: Ilyes Khennak, Faysal Bendakir and Samy Hamdi. Data collection and/or processing: Ilyes Khennak, Faysal Bendakir and Samy Hamdi. Analysis and/or interpretation: Ilyes Khennak. Literature search: Ilyes Khennak. Manuscript writing: Ilyes Khennak. Critical review: Habiba Drias and Yassine Drias.

Corresponding author

Correspondence to Ilyes Khennak.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

The authors consent to the declarations and to the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khennak, I., Drias, H., Drias, Y. et al. I/F-Race tuned firefly algorithm and particle swarm optimization for K-medoids-based clustering. Evol. Intel. 16, 351–373 (2023). https://doi.org/10.1007/s12065-022-00794-z

  • DOI: https://doi.org/10.1007/s12065-022-00794-z
