Hybrid binary arithmetic optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical data

The Journal of Supercomputing

Abstract

Gene expression data play a significant role in the development of effective cancer diagnosis and prognosis techniques. However, the data contain many redundant, noisy, and irrelevant genes (features), which reduce the predictive accuracy of diagnosis and increase the computational burden. To overcome these challenges, this article proposes a new hybrid filter/wrapper gene selection method called mRMR-BAOAC-SA. The method uses Minimum Redundancy Maximum Relevance (mRMR) as a first-stage filter to pick top-ranked genes. Then, Simulated Annealing (SA) and a crossover operator are introduced into the Binary Arithmetic Optimization Algorithm (BAOA) to form a novel hybrid wrapper feature selection method, BAOAC-SA, which aims to discover the smallest set of informative genes for classification. BAOAC-SA is an enhanced version of BAOA in which SA and crossover help the algorithm escape local optima and strengthen its global search capability. The proposed method was evaluated on 10 well-known microarray datasets, and its results were compared with those of other state-of-the-art gene selection methods. The experimental results show that the proposed approach outperforms the existing methods in terms of both classification accuracy and the number of selected genes.
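To make the role of the two added operators concrete, the following is a minimal Python sketch of a simulated annealing refinement and a uniform crossover acting on a binary gene mask. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the neighbourhood move, cooling schedule, crossover type, or fitness function, so the single-bit flip, the geometric cooling, the uniform crossover, and `toy_fitness` (with its `informative` and `alpha` parameters) are all hypothetical choices.

```python
import math
import random


def simulated_annealing(mask, fitness, t0=1.0, cooling=0.95, steps=200, rng=None):
    """Refine a binary feature mask with SA; lower fitness is better.

    t0, cooling, and steps are illustrative defaults, not values from the paper.
    """
    if rng is None:
        rng = random.Random(0)
    current, current_fit = list(mask), fitness(mask)
    best, best_fit = list(current), current_fit
    temp = t0
    for _ in range(steps):
        # Neighbour: flip one randomly chosen bit (select or drop one gene).
        neighbour = list(current)
        i = rng.randrange(len(neighbour))
        neighbour[i] = 1 - neighbour[i]
        neighbour_fit = fitness(neighbour)
        delta = neighbour_fit - current_fit
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / temp), which lets the search leave local optima.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_fit = neighbour, neighbour_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
        temp *= cooling  # geometric cooling (assumed schedule)
    return best, best_fit


def uniform_crossover(parent_a, parent_b, rng):
    """Build a child mask by taking each bit from either parent with equal chance."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def toy_fitness(mask, informative=(0, 3, 5), alpha=0.99):
    """Hypothetical stand-in for a wrapper fitness.

    A real wrapper would train a classifier on the selected genes; here the
    'error' is just the fraction of assumed-informative genes left out,
    combined with a small penalty on the fraction of genes selected.
    """
    hits = sum(mask[i] for i in informative)
    error = 1.0 - hits / len(informative)
    size = sum(mask) / len(mask)
    return alpha * error + (1 - alpha) * size


if __name__ == "__main__":
    rng = random.Random(42)
    parent_a = [rng.randint(0, 1) for _ in range(8)]
    parent_b = [rng.randint(0, 1) for _ in range(8)]
    child = uniform_crossover(parent_a, parent_b, rng)
    best, best_fit = simulated_annealing(child, toy_fitness, rng=rng)
    print(best, round(best_fit, 4))
```

In a full pipeline of the kind the abstract describes, the masks would index only the genes that survive the mRMR filter, and the fitness would come from a classifier evaluated on the selected genes rather than from a fixed list of informative positions.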

Author information

Contributions

EP and EP designed the model and the computational framework. Both authors carried out the implementation, performed the experiments, and wrote the manuscript.

Corresponding author

Correspondence to Elham Pashaei.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pashaei, E., Pashaei, E. Hybrid binary arithmetic optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical data. J Supercomput 78, 15598–15637 (2022). https://doi.org/10.1007/s11227-022-04507-2

  • DOI: https://doi.org/10.1007/s11227-022-04507-2
