Modified marine predators algorithm for feature selection: case study metabolomics

  • Regular Paper
Knowledge and Information Systems

Abstract

Feature selection (FS) reduces the dimensionality of high-dimensional datasets. It retains the most relevant information and lowers the computational cost of classification. Recently, metaheuristic methods have been widely applied to optimization problems, including FS. In this study, we present an FS method based on a new modified version of the marine predators algorithm (MPA). In the developed model, MPASCA, the sine–cosine algorithm (SCA) serves as a local search for the MPA to improve its search ability. To evaluate MPASCA, extensive experiments were carried out on 18 UCI datasets. In addition, a metabolomics dataset is used to test the proposed method in a real-world application. Furthermore, we compared MPASCA extensively with several state-of-the-art methods to verify its efficiency. The evaluation results showed that MPASCA outperforms the compared methods in terms of classification measures.
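
To illustrate the kind of SCA local-search move the abstract refers to, the following is a minimal Python sketch of the standard sine–cosine position update applied to a continuous feature-selection encoding. It is not the authors' implementation: the abstract does not say how SCA is coupled to the MPA. The function names, the clamp to [0, 1], the 0.5 binarization threshold, and the weighted fitness with `alpha = 0.99` are all illustrative assumptions.

```python
import math
import random


def sca_step(position, best, t, max_iter, a=2.0, rng=random):
    """One sine-cosine update of a candidate toward the best-known solution.

    Follows the standard SCA rule; the clamp to [0, 1] is an assumption made
    here so that positions can be thresholded into a feature mask.
    """
    r1 = a - t * (a / max_iter)  # shrinks linearly from a to 0 over the run
    new_position = []
    for x, p in zip(position, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)  # phase of the sine/cosine move
        r3 = rng.uniform(0.0, 2.0)            # random weight on the best solution
        r4 = rng.random()                     # switches between sine and cosine
        wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        x_new = x + r1 * wave * abs(r3 * p - x)
        new_position.append(min(1.0, max(0.0, x_new)))
    return new_position


def to_feature_mask(position, threshold=0.5):
    """Binarize a position: feature j is selected if position[j] > threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return [x > threshold for x in position]


def fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classification error and fraction of selected features.

    Lower is better. The form and alpha are common wrapper-FS conventions,
    assumed here rather than taken from the paper.
    """
    selected_ratio = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_ratio
```

In a wrapper setting, `error_rate` would come from training a classifier on the features selected by the mask; that step is omitted here because the abstract does not name the classifier.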



Funding

The authors received no specific funding for this work.

Author information

Corresponding author

Correspondence to Mohammed A. A. Al-qaness.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interests.



About this article


Cite this article

Abd Elaziz, M., Ewees, A.A., Yousri, D. et al. Modified marine predators algorithm for feature selection: case study metabolomics. Knowl Inf Syst 64, 261–287 (2022). https://doi.org/10.1007/s10115-021-01641-w


  • DOI: https://doi.org/10.1007/s10115-021-01641-w
