Abstract
Feature selection (FS) is a necessary step for reducing the high dimensionality of a dataset. It retains the most relevant information and reduces the computational effort of the classification process. Recently, metaheuristic methods have been widely employed for various optimization problems, including FS. In the current study, we present an FS method based on a new modified version of the marine predators algorithm (MPA). In the developed model, called MPASCA, the sine–cosine algorithm (SCA) works as a local search for the MPA to improve its search ability. To evaluate the performance of MPASCA, extensive experiments were carried out using 18 UCI datasets. In addition, a metabolomics dataset is used to test the proposed method in a real-world application. Furthermore, we carried out extensive comparisons with several state-of-the-art methods to verify the efficiency of MPASCA. The evaluation outcomes showed that MPASCA performs well and outperforms the compared methods in terms of classification measures.
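To make the idea of SCA acting as a local search more concrete, the following is a minimal sketch rather than the authors' implementation. It uses the standard SCA position update (a sine or cosine oscillation around the best solution, with an amplitude that shrinks over the iterations). The binarization threshold of 0.5, the clamping to [0, 1], the fitness weight `alpha = 0.99`, the greedy acceptance rule, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def sca_step(position, best, t, max_iter, a=2.0, rng=random):
    """One standard sine-cosine update of `position` around `best`."""
    r1 = a - t * (a / max_iter)  # amplitude shrinks linearly from a to 0
    new_position = []
    for x, p in zip(position, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)  # phase of the oscillation
        r3 = rng.uniform(0.0, 2.0)            # random weight on the best solution
        r4 = rng.random()                     # switches between sine and cosine
        wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        x_new = x + r1 * wave * abs(r3 * p - x)
        # Assumption: positions are kept in [0, 1] so they can be thresholded.
        new_position.append(min(1.0, max(0.0, x_new)))
    return new_position


def to_mask(position, threshold=0.5):
    """Binarize a continuous position; True means the feature is kept (assumed rule)."""
    return [x > threshold for x in position]


def fitness(mask, error_rate, alpha=0.99):
    """Assumed wrapper fitness: weighted error plus fraction of kept features (lower is better)."""
    kept = sum(mask)
    if kept == 0:
        return float("inf")  # an empty feature subset is not a valid solution
    return alpha * error_rate(mask) + (1.0 - alpha) * kept / len(mask)


def local_search(position, best, error_rate, t, max_iter, rng=random):
    """Try one SCA move and accept it only if the fitness does not get worse (assumed rule)."""
    candidate = sca_step(position, best, t, max_iter, rng=rng)
    if fitness(to_mask(candidate), error_rate) <= fitness(to_mask(position), error_rate):
        return candidate
    return position
```

Here `error_rate` is a hypothetical callable that takes a feature mask and returns a classification error in [0, 1], for example the hold-out error of a classifier trained on the selected features. In a hybrid such as the one described above, a step like `local_search` would be applied to candidate solutions produced by the MPA.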
Funding
The authors received no specific funding for this work.
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interests.
About this article
Cite this article
Abd Elaziz, M., Ewees, A.A., Yousri, D. et al. Modified marine predators algorithm for feature selection: case study metabolomics. Knowl Inf Syst 64, 261–287 (2022). https://doi.org/10.1007/s10115-021-01641-w
DOI: https://doi.org/10.1007/s10115-021-01641-w