Dynamic evolutionary data and text document clustering approach using improved Aquila optimizer based arithmetic optimization algorithm and differential evolution

  • Original Article
Neural Computing and Applications

Abstract

Data and text clustering are widely used in data mining, mainly for big data analytics. The main problem in these techniques is finding the most coherent clusters, that is, allocating similar objects to the same group. This paper proposes an improved clustering analysis approach based on an advanced optimization method called AOAOA. AOAOA improves the search performance of the Aquila optimizer (AO) by incorporating the operators of the arithmetic optimization algorithm (AOA) and differential evolution (DE), together with a novel transition mechanism. The primary motivation for this modification is that the original optimizer suffers from local optima stagnation and lacks balance in its search. AOAOA addresses these shortcomings by integrating several search strategies and a new update strategy. Experiments are conducted in two parts, on eight standard data clustering datasets and ten text document benchmark datasets, to evaluate the performance of the proposed method. AOAOA is compared against several well-known optimization algorithms and state-of-the-art methods published in the literature. The data clustering results show promising performance for AOAOA relative to the comparative data clustering methods. Moreover, the text document clustering results show that AOAOA can find new best solutions for several complicated cases. Overall, AOAOA obtained accurate and robust results compared to several state-of-the-art methods.
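As an illustration of how a population-based optimizer can be applied to clustering, the following is a minimal sketch of plain differential evolution (the classic DE/rand/1/bin scheme) searching for cluster centroids. It is not the AOAOA method described above: the AO and AOA operators and the transition mechanism are not specified in this abstract and are not reproduced here. The objective (sum of squared distances to the nearest centroid), the flat centroid encoding, and all function names and parameter values (`de_cluster`, `f`, `cr`, `pop_size`) are assumptions chosen for the example.

```python
import random


def clustering_cost(centroids, points):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def decode(vector, k, dim):
    """Split a flat candidate vector into k centroids of length dim."""
    return [vector[i * dim:(i + 1) * dim] for i in range(k)]


def de_cluster(points, k, pop_size=20, generations=100, f=0.5, cr=0.9, seed=0):
    """Hypothetical helper: DE/rand/1/bin over flattened centroid vectors."""
    rng = random.Random(seed)
    dim = len(points[0])
    n = k * dim
    # Per-dimension bounds of the data; candidates are clamped to this box.
    lo = [min(p[d] for p in points) for d in range(dim)]
    hi = [max(p[d] for p in points) for d in range(dim)]

    # Random initial population; position j belongs to data dimension j % dim.
    pop = [[rng.uniform(lo[j % dim], hi[j % dim]) for j in range(n)]
           for _ in range(pop_size)]
    cost = [clustering_cost(decode(v, k, dim), points) for v in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated gene
            trial = []
            for j in range(n):
                if rng.random() < cr or j == j_rand:
                    # Mutation: base vector plus scaled difference vector.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo[j % dim]), hi[j % dim])
                else:
                    # Binomial crossover: inherit the gene from the parent.
                    v = pop[i][j]
                trial.append(v)
            trial_cost = clustering_cost(decode(trial, k, dim), points)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost

    best = min(range(pop_size), key=cost.__getitem__)
    return decode(pop[best], k, dim), cost[best]


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centroids, best_cost = de_cluster(data, k=2)
    print(centroids, best_cost)
```

Because selection is greedy, each individual's cost never increases from one generation to the next. Escaping local optima therefore depends entirely on the mutation and crossover operators, which is the kind of stagnation issue that hybrid schemes aim to reduce.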

Acknowledgements

The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work under Grant Code 22UQU4320277DSR04.

Author information

Corresponding author

Correspondence to Laith Abualigah.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abualigah, L., Almotairi, K.H. Dynamic evolutionary data and text document clustering approach using improved Aquila optimizer based arithmetic optimization algorithm and differential evolution. Neural Comput & Applic 34, 20939–20971 (2022). https://doi.org/10.1007/s00521-022-07571-0

  • DOI: https://doi.org/10.1007/s00521-022-07571-0
