
Artificial rabbits optimization algorithm with automatically DBSCAN clustering algorithm to similarity agent update for features selection problems

Published in The Journal of Supercomputing

Abstract

Feature selection is an important step in data mining for reducing the dimensionality of datasets. Because feature selection is inherently an NP-hard problem, no deterministic algorithm is known that solves it in acceptable time. Meta-heuristic algorithms are reliable alternatives for solving such problems within acceptable time, and a large number of feature selection algorithms based on meta-heuristic optimization have been proposed in the literature. In this work, a new feature selection algorithm is proposed that combines the artificial rabbits optimization (ARO) meta-heuristic with the DBSCAN clustering algorithm, whose input parameters are adjusted automatically (ARO-DBSCAN). Using auxiliary algorithms to improve the performance of a meta-heuristic can increase the risk of getting stuck in local optima; the proposed method improves the performance of ARO on the feature selection problem without increasing that risk. The density-based DBSCAN clustering increases the exploitation of ARO in the search space while preserving its exploration, so the performance of ARO improves significantly on feature selection problems. The proposed algorithm is compared with eight state-of-the-art feature selection algorithms on the UCI benchmark datasets and three real-world high-dimensional datasets. The experimental results show that ARO-DBSCAN performs better within an acceptable execution time. Moreover, on high-dimensional data the proposed method is able to significantly reduce the number of selected features, which makes the analysis of these datasets more efficient. The source code of the proposed algorithm is publicly available at https://github.com/alihamdipour/ARO-DBSCAN.
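As a rough illustration of the mechanism the abstract describes, the following Python sketch clusters the population with DBSCAN, estimates eps automatically from the k-distance curve, and updates an agent toward a similar agent drawn from its own density cluster. This is a minimal sketch under stated assumptions (a real-valued population matrix, a knee heuristic for eps, a simple linear move rule); the function names and details are illustrative, not the authors' implementation, which is available in the linked repository.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def auto_eps(pop: np.ndarray, k: int = 4) -> float:
    """Estimate DBSCAN's eps from the knee of the sorted k-distance curve."""
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(pop).kneighbors(pop)
    kd = np.sort(dists[:, -1])                  # distance to k-th neighbour, ascending
    x = np.linspace(0.0, 1.0, kd.size)
    y = (kd - kd.min()) / (np.ptp(kd) + 1e-12)  # normalise curve to [0, 1]
    return float(kd[np.argmax(np.abs(y - x))])  # point farthest from the diagonal

def similarity_update(pop: np.ndarray, i: int, rng: np.random.Generator) -> np.ndarray:
    """Move agent i toward a randomly chosen agent of its own density cluster."""
    labels = DBSCAN(eps=auto_eps(pop), min_samples=4).fit_predict(pop)
    mates = np.flatnonzero((labels == labels[i]) & (labels != -1))
    mates = mates[mates != i]
    if mates.size == 0:                         # noise point or singleton cluster:
        return pop[i] + rng.normal(scale=0.1, size=pop.shape[1])  # keep exploring
    j = rng.choice(mates)
    return pop[i] + rng.random() * (pop[j] - pop[i])  # exploit within the cluster

rng = np.random.default_rng(0)
population = rng.random((30, 10))               # 30 agents, 10 features
print(similarity_update(population, 0, rng))
```

Tying the step size to a cluster mate rather than to an arbitrary agent is what lets a scheme like this exploit dense regions without collapsing the whole population onto one point, which matches the exploration/exploitation trade-off claimed above.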




Data availability and access

The datasets analyzed during the current study are publicly available in the UCI repository (https://archive.ics.uci.edu/datasets).
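For readers who want to pull these benchmarks programmatically, the snippet below uses the ucimlrepo helper package published by the UCI repository. The dataset id shown (52, which I believe maps to Ionosphere) is an illustrative example, not a list taken from this paper; check the repository page if the id has changed.

```python
# pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

ionosphere = fetch_ucirepo(id=52)     # or fetch_ucirepo(name="Ionosphere")
X = ionosphere.data.features          # pandas DataFrame of attributes
y = ionosphere.data.targets           # pandas DataFrame with the class label
print(X.shape)
print(y.value_counts())
```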

References

1. Fan C, Chen M, Wang X, Wang J, Huang B (2021) A review on data preprocessing techniques toward efficient and reliable knowledge discovery from building operational data. Front Energy Res 9:652801
2. Kohavi R, John GH (1997) Wrappers for feature subset selection. Artif Intell 97(1–2):273–324
3. Abdulwahab HM, Ajitha S, Saif MAN (2022) Feature selection techniques in the context of big data: taxonomy and analysis. Appl Intell 52(12):13568–13613
4. Dokeroglu T, Deniz A, Kiziloz HE (2022) A comprehensive survey on recent metaheuristics for feature selection. Neurocomputing 494:269–296
5. Abramson D, Abela J (1991) A parallel genetic algorithm for solving the school timetabling problem. Citeseer
6. Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings of the Sixth International Symposium on Micro Machine and Human Science (MHS'95), pp 39–43. IEEE
7. Juneja M, Nagar S (2016) Particle swarm optimization algorithm and its parameters: a review. In: 2016 International Conference on Control, Computing, Communication and Materials (ICCCCM), pp 1–5. IEEE
8. Jain M, Singh V, Rani A (2019) A novel nature-inspired algorithm for optimization: squirrel search algorithm. Swarm Evol Comput 44:148–175
9. Abdollahzadeh B, Soleimanian Gharehchopogh F, Mirjalili S (2021) Artificial gorilla troops optimizer: a new nature-inspired metaheuristic algorithm for global optimization problems. Int J Intell Syst 36(10):5887–5958
10. Zervoudakis K, Tsafarakis S (2020) A mayfly optimization algorithm. Comput Ind Eng 145:106559
11. Kaveh A, Farhoudi N (2013) A new optimization method: dolphin echolocation. Adv Eng Softw 59:53–70
12. Pan W-T (2012) A new fruit fly optimization algorithm: taking the financial distress model as an example. Knowl-Based Syst 26:69–74
13. Moosavi SHS, Bardsiri VK (2017) Satin bowerbird optimizer: a new optimization algorithm to optimize ANFIS for software development effort estimation. Eng Appl Artif Intell 60:1–15
14. Yang X-S (2012) Flower pollination algorithm for global optimization. In: Unconventional Computation and Natural Computation: 11th International Conference, UCNC 2012, Orléans, France, September 3–7, 2012, Proceedings, pp 240–249. Springer
15. Koçer HG, Türkoğlu B, Uymaz SA (2023) Chaotic golden ratio guided local search for big data optimization. Eng Sci Technol Int J 41:101388
16. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
17. Turkoglu B, Uymaz SA, Kaya E (2024) Chaotic artificial algae algorithm for solving global optimization with real-world space trajectory design problems. Arab J Sci Eng 1–28
18. Uymaz O, Turkoglu B, Kaya E, Asuroglu T (2024) A novel diversity guided galactic swarm optimization with feedback mechanism. IEEE Access
19. Uymaz SA, Tezel G, Yel E (2015) Artificial algae algorithm (AAA) for nonlinear global optimization. Appl Soft Comput 31:153–171
20. Muthiah-Nakarajan V, Noel MM (2016) Galactic swarm optimization: a new global optimization metaheuristic inspired by galactic motion. Appl Soft Comput 38:771–787
21. Wang L, Cao Q, Zhang Z, Mirjalili S, Zhao W (2022) Artificial rabbits optimization: a new bio-inspired meta-heuristic algorithm for solving engineering optimization problems. Eng Appl Artif Intell 114:105082
22. Ester M, Kriegel H-P, Sander J, Xu X et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of KDD-96, pp 226–231
23. Oreski S, Oreski G (2014) Genetic algorithm-based heuristic for feature selection in credit risk assessment. Expert Syst Appl 41(4):2052–2064
24. Wang Y, Chen X, Jiang W, Li L, Li W, Yang L, Liao M, Lian B, Lv Y, Wang S et al (2011) Predicting human microRNA precursors based on an optimized feature subset generated by GA-SVM. Genomics 98(2):73–78
25. Khammassi C, Krichen S (2017) A GA-LR wrapper approach for feature selection in network intrusion detection. Comput Secur 70:255–277
26. Wang X, Yang J, Teng X, Xia W, Jensen R (2007) Feature selection based on rough sets and particle swarm optimization. Pattern Recogn Lett 28(4):459–471
27. Chen L-F, Su C-T, Chen K-H, Wang P-C (2012) Particle swarm optimization for feature selection with application in obstructive sleep apnea diagnosis. Neural Comput Appl 21:2087–2096
28. Zhou Y, Lin J, Guo H (2021) Feature subset selection via an improved discretization-based particle swarm optimization. Appl Soft Comput 98:106794
29. Yang H, Du Q, Chen G (2012) Particle swarm optimization-based hyperspectral dimensionality reduction for urban land cover classification. IEEE J Sel Top Appl Earth Obs Remote Sens 5(2):544–554
30. Pramanik R, Sarkar S, Sarkar R (2022) An adaptive and altruistic PSO-based deep feature selection method for pneumonia detection from chest X-rays. Appl Soft Comput 128:109464
31. Dorigo M, Di Caro G (1999) Ant colony optimization: a new meta-heuristic. In: Proceedings of the 1999 Congress on Evolutionary Computation (CEC99), vol 2, pp 1470–1477. IEEE
32. Sivagaminathan RK, Ramakrishnan S (2007) A hybrid approach for feature subset selection using neural networks and ant colony optimization. Expert Syst Appl 33(1):49–60
33. Kanan HR, Faez K (2008) An improved feature selection method based on ant colony optimization (ACO) evaluated on face recognition system. Appl Math Comput 205(2):716–725
34. Aghdam MH, Ghasem-Aghaee N, Basiri ME (2009) Text feature selection using ant colony optimization. Expert Syst Appl 36(3):6843–6853
35. Karimi F, Dowlatshahi MB, Hashemi A (2023) SemiACO: a semi-supervised feature selection based on ant colony optimization. Expert Syst Appl 214:119130
36. Faramarzi A, Heidarinejad M, Stephens B, Mirjalili S (2020) Equilibrium optimizer: a novel optimization algorithm. Knowl-Based Syst 191:105190
37. Ahmed S, Ghosh KK, Mirjalili S, Sarkar R (2021) AIEOU: automata-based improved equilibrium optimizer with U-shaped transfer function for feature selection. Knowl-Based Syst 228:107283
38. Mirjalili S, Lewis A (2016) The whale optimization algorithm. Adv Eng Softw 95:51–67
39. Mafarja M, Mirjalili S (2018) Whale optimization approaches for wrapper feature selection. Appl Soft Comput 62:441–453
40. Geem ZW, Kim JH, Loganathan GV (2001) A new heuristic optimization algorithm: harmony search. Simulation 76(2):60–68
41. Ahmed S, Ghosh KK, Singh PK, Geem ZW, Sarkar R (2020) Hybrid of harmony search algorithm and ring theory-based evolutionary algorithm for feature selection. IEEE Access 8:102629–102645
42. Mafarja M, Qasem A, Heidari AA, Aljarah I, Faris H, Mirjalili S (2020) Efficient hybrid nature-inspired binary optimizers for feature selection. Cogn Comput 12:150–175
43. Pan H, Chen S, Xiong H (2023) A high-dimensional feature selection method based on modified gray wolf optimization. Appl Soft Comput 135:110031
44. Balochian S, Baloochian H (2019) Social mimic optimization algorithm and engineering applications. Expert Syst Appl 134:178–191
45. Ghosh KK, Singh PK, Hong J, Geem ZW, Sarkar R (2020) Binary social mimic optimization algorithm with X-shaped transfer function for feature selection. IEEE Access 8:97890–97906
46. Tharwat A, Gabel T (2020) Parameters optimization of support vector machines for imbalanced data using social ski driver algorithm. Neural Comput Appl 32:6925–6938
47. Alhussan AA, Abdelhamid AA, El-Kenawy E-SM, Ibrahim A, Eid MM, Khafaga DS, Ahmed AE (2023) A binary waterwheel plant optimization algorithm for feature selection. IEEE Access
48. Mirjalili S (2016) SCA: a sine cosine algorithm for solving optimization problems. Knowl-Based Syst 96:120–133
49. Takieldeen AE, El-Kenawy E-SM, Hadwan M, Zaki RM (2022) Dipper throated optimization algorithm for unconstrained function and feature selection. Comput Mater Contin 72:1465–1481
50. Abdelhamid AA, El-Kenawy E-SM, Ibrahim A, Eid MM, Khafaga DS, Alhussan AA, Mirjalili S, Khodadadi N, Lim WH, Shams MY (2023) Innovative feature selection method based on hybrid sine cosine and dipper throated optimization algorithms. IEEE Access 11:79750–79776
51. Bertsimas D, Tsitsiklis J (1993) Simulated annealing. Stat Sci 8(1):10–15
52. Mafarja MM, Mirjalili S (2017) Hybrid whale optimization algorithm with simulated annealing for feature selection. Neurocomputing 260:302–312
53. Khan K, Rehman SU, Aziz K, Fong S, Sarasvady S (2014) DBSCAN: past, present and future. In: Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), pp 232–238. IEEE
54. Schubert E, Sander J, Ester M, Kriegel HP, Xu X (2017) DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans Database Syst 42(3):1–21
55. Lai W, Zhou M, Hu F, Bian K, Song Q (2019) A new DBSCAN parameters determination method based on improved MVO. IEEE Access 7:104085–104095
56. Sawant K (2014) Adaptive methods for determining DBSCAN parameters. Int J Innov Sci Eng Technol 1(4):329–334
57. Starczewski A, Goetzen P, Er MJ (2020) A new method for automatic determining of the DBSCAN parameters. J Artif Intell Soft Comput Res 10:209
58. Ankerst M, Breunig MM, Kriegel H-P, Sander J (1999) OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Rec 28(2):49–60
59. Liu P, Zhou D, Wu N (2007) VDBSCAN: varied density based spatial clustering of applications with noise. In: 2007 International Conference on Service Systems and Service Management, pp 1–4. IEEE


Author information


Contributions

The authors confirm their contributions to the paper as follows: Ali Hamdipour and Abdolali Basiri contributed to the study conception and design; Ali Hamdipour and Mostafa Zaare contributed to data collection; Ali Hamdipour, Abdolali Basiri, and Mostafa Zaare contributed to the analysis and interpretation of results; Ali Hamdipour prepared the draft manuscript; Abdolali Basiri, Mostafa Zaare, and Seyedali Mirjalili supervised the study; Abdolali Basiri and Seyedali Mirjalili contributed to review and editing. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Abdolali Basiri.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

This study used only publicly available data; no private or personal information was involved, so formal ethical approval and consent were not required. All applicable legal guidelines for research ethics were followed.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A

Details of comparisons

A.1 In terms of classification accuracy

A comparison of the BSMO feature selection algorithm with ARO-DBSCAN in terms of classification accuracy shows that BSMO achieved higher accuracy in two cases (BreastEW, Exactly2), the two algorithms achieved the same accuracy in five cases (Breastcancer, PenglungEW, Wine, Tic-tac-toe, Zoo), and ARO-DBSCAN achieved higher accuracy in the remaining 14 cases. Overall, ARO-DBSCAN therefore outperforms BSMO in terms of classification accuracy.

A comparison with WOASAT-2 shows that WOASAT-2 did not achieve higher classification accuracy in any case: the two algorithms achieved the same accuracy in five cases (PenglungEW, Wine, Tic-tac-toe, GSAUFM, E-mails), and ARO-DBSCAN achieved higher accuracy in the remaining 16 cases. Overall, ARO-DBSCAN therefore outperforms WOASAT-2 in terms of classification accuracy.

A comparison with SSD-LAHC shows that SSD-LAHC achieved higher accuracy in two cases (BreastEW, Exactly2), the two algorithms achieved the same accuracy in nine cases (PenglungEW, BreastCancer, Exactly, M-of-n, Tic-tac-toe, Vote, Wine, Zoo, ARCENE), and ARO-DBSCAN achieved higher accuracy in the remaining 10 cases. Overall, ARO-DBSCAN therefore outperforms SSD-LAHC in terms of classification accuracy.

A comparison with RTHS shows that RTHS did not achieve higher classification accuracy in any case: the two algorithms achieved the same accuracy in five cases (Tic-tac-toe, PenglungEW, Wine, Zoo, CongressEW), and ARO-DBSCAN achieved higher accuracy in the remaining 16 cases. Overall, ARO-DBSCAN therefore outperforms RTHS in terms of classification accuracy.

A comparison with WOA-CM shows that WOA-CM achieved higher accuracy in one case (Exactly2), the two algorithms achieved the same accuracy in three cases (Tic-tac-toe, PenglungEW, Wine), and ARO-DBSCAN achieved higher accuracy in the remaining 17 cases. Overall, ARO-DBSCAN therefore outperforms WOA-CM in terms of classification accuracy.

A comparison with AAPSO shows that AAPSO achieved higher accuracy in one case (Exactly2), the two algorithms achieved the same accuracy in four cases (Tic-tac-toe, Zoo, Wine, PenglungEW), and ARO-DBSCAN achieved higher accuracy in the remaining 16 cases. Overall, ARO-DBSCAN therefore outperforms AAPSO in terms of classification accuracy.

A comparison with AIEOU shows that AIEOU achieved higher accuracy in one case (Exactly2), the two algorithms achieved the same accuracy in three cases (Tic-tac-toe, Wine, PenglungEW), and ARO-DBSCAN achieved higher accuracy in the remaining 17 cases. Overall, ARO-DBSCAN therefore outperforms AIEOU in terms of classification accuracy.

A comparison with ASGW shows that ARO-DBSCAN achieved higher classification accuracy in all 21 cases. Overall, ARO-DBSCAN therefore outperforms ASGW in terms of classification accuracy. The win/tie/loss bookkeeping behind these tallies is automated in the sketch after this subsection.
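As referenced above, the sketch below shows one way to compute such win/tie/loss counts from per-dataset accuracies. The accuracy values in the usage example are hypothetical placeholders, not results taken from the paper's tables.

```python
from typing import Mapping, Tuple

def tally(ours: Mapping[str, float], theirs: Mapping[str, float],
          tol: float = 1e-4) -> Tuple[int, int, int]:
    """Count datasets on which `ours` wins, ties, or loses on accuracy."""
    win = tie = loss = 0
    for name, acc in ours.items():
        other = theirs[name]
        if abs(acc - other) <= tol:   # equal up to rounding -> tie
            tie += 1
        elif acc > other:
            win += 1
        else:
            loss += 1
    return win, tie, loss

# Hypothetical per-dataset accuracies for two algorithms:
aro = {"Wine": 0.99, "Zoo": 1.00, "Exactly2": 0.76}
bsmo = {"Wine": 0.99, "Zoo": 1.00, "Exactly2": 0.78}
print(tally(aro, bsmo))   # -> (0, 2, 1): no wins, two ties, one loss
```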

A.2 In terms of NSF

Table 7 presents the comparison of ARO-DBSCAN with the state-of-the-art algorithms in terms of NSF (number of selected features). A comparison with BSMO shows that BSMO selected fewer features in four cases (Ionosphere, WaveformEW, KrVsKpEW, SpectEW), the two algorithms selected the same number of features in two cases (BreastCancer, Tic-tac-toe), and ARO-DBSCAN selected fewer features in the remaining 15 cases. Overall, ARO-DBSCAN therefore outperforms BSMO in terms of NSF.

A comparison with SSD-LAHC shows that SSD-LAHC selected fewer features in five cases (Exactly, Zoo, Exactly2, WaveformEW, KrVsKpEW), the two algorithms selected the same number of features in four cases (BreastCancer, M-of-n, Tic-tac-toe, Exactly), and ARO-DBSCAN selected fewer features in the remaining 12 cases. Overall, ARO-DBSCAN therefore outperforms SSD-LAHC in terms of NSF.

A comparison with WOA-CM shows that WOA-CM selected fewer features in four cases (WaveformEW, KrVsKpEW, Zoo, CongressEW), the two algorithms selected the same number of features in one case (Tic-tac-toe), and ARO-DBSCAN selected fewer features in the remaining 16 cases. Overall, ARO-DBSCAN therefore outperforms WOA-CM in terms of NSF.

A comparison with RTHS shows that RTHS selected fewer features in three cases (BreastCancer, WaveformEW, KrVsKpEW), the two algorithms selected the same number of features in one case (Tic-tac-toe), and ARO-DBSCAN selected fewer features in the remaining 17 cases. Overall, ARO-DBSCAN therefore outperforms RTHS in terms of NSF.

A comparison with AAPSO shows that AAPSO selected fewer features in three cases (Exactly2, WaveformEW, KrVsKpEW), the two algorithms selected the same number of features in two cases (BreastCancer, Tic-tac-toe), and ARO-DBSCAN selected fewer features in the remaining 16 cases. Overall, ARO-DBSCAN therefore outperforms AAPSO in terms of NSF.

A comparison with AIEOU shows that AIEOU selected fewer features in six cases (BreastCancer, Zoo, WaveformEW, KrVsKpEW, SpectEW, CongressEW), the two algorithms selected the same number of features in one case (Tic-tac-toe), and ARO-DBSCAN selected fewer features in the remaining 14 cases. Overall, ARO-DBSCAN therefore outperforms AIEOU in terms of NSF.

A comparison with ASGW shows that ASGW selected fewer features in three cases (Tic-tac-toe, WaveformEW, KrVsKpEW), and ARO-DBSCAN selected fewer features in the remaining 18 cases. Overall, ARO-DBSCAN therefore outperforms ASGW in terms of NSF.
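The two criteria compared in this appendix, classification accuracy and NSF, are typically folded into a single wrapper fitness in the feature-selection literature. The sketch below shows that conventional weighted form; the weight alpha = 0.99 is a common convention from that literature, not a value taken from this paper.

```python
def fitness(error_rate: float, n_selected: int, n_total: int,
            alpha: float = 0.99) -> float:
    """Lower is better: alpha * classification error + (1 - alpha) * feature ratio."""
    return alpha * error_rate + (1 - alpha) * (n_selected / n_total)

# With equal error rates, the subset with fewer features wins the tie:
print(fitness(0.05, 10, 40))   # 0.0520
print(fitness(0.05, 5, 40))    # 0.05075
```

Under such a fitness, the per-criterion comparisons above are complementary: an algorithm can trade a slightly larger subset for lower error, which is why both tables are reported.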


About this article


Cite this article

Hamdipour, A., Basiri, A., Zaare, M. et al. Artificial rabbits optimization algorithm with automatically DBSCAN clustering algorithm to similarity agent update for features selection problems. J Supercomput 81, 150 (2025). https://doi.org/10.1007/s11227-024-06606-8

