Abstract
Feature selection (FS) is the process of finding the smallest subset of features that describes a dataset as well as the original feature set does. FS is a crucial preprocessing step for data mining techniques because it improves the speed and accuracy of prediction and provides a better understanding of the stored data. The success of the FS process depends on balancing two factors: selecting a minimal number of features and maintaining maximum accuracy in the results. In this paper, two methods are proposed to improve the FS process. First, the mine blast algorithm (MBA) is introduced to optimize the FS process in the exploration phase. Second, the MBA is hybridized with simulated annealing (SA), which acts as a local search in the exploitation phase to enhance the solutions located by the MBA. The proposed approaches (MBA and MBA–SA) are tested on 18 benchmark datasets from the UCI repository, and the experimental results indicate that MBA–SA performed well when compared with five approaches from the literature.
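To make the trade-off between subset size and accuracy concrete, the following minimal sketch shows how a simulated annealing local search could refine a binary feature mask. It is an illustration under stated assumptions, not the implementation used in the paper: the weighted-sum fitness, the weight `alpha`, the cooling parameters, and the `toy_error` evaluator are all hypothetical, and the MBA exploration phase that would supply the starting mask is not shown.

```python
import math
import random


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of error and fraction of features kept (lower is better).

    The form and the value of alpha are assumptions made for this sketch.
    """
    selected = sum(mask)
    if selected == 0:
        return float("inf")  # an empty subset cannot describe the data
    return alpha * error_rate(mask) + (1 - alpha) * selected / len(mask)


def sa_local_search(mask, error_rate, t0=1.0, cooling=0.95, steps=200, seed=0):
    """Refine a 0/1 feature mask with simulated annealing."""
    rng = random.Random(seed)
    current, current_fit = list(mask), fitness(mask, error_rate)
    best, best_fit = list(current), current_fit
    t = t0
    for _ in range(steps):
        # Neighbor: flip one randomly chosen feature in or out of the subset.
        neighbor = list(current)
        i = rng.randrange(len(neighbor))
        neighbor[i] = 1 - neighbor[i]
        neighbor_fit = fitness(neighbor, error_rate)
        delta = neighbor_fit - current_fit
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_fit = neighbor, neighbor_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_fit


def toy_error(mask):
    """Hypothetical stand-in for a classifier's error: features 0 and 2 help."""
    return 0.5 - 0.2 * mask[0] - 0.2 * mask[2]


start = [1] * 6  # in a hybrid scheme this mask would come from the global search
best, best_fit = sa_local_search(start, toy_error)
print(best, round(best_fit, 4))
```

In a wrapper setting, `toy_error` would be replaced by the cross-validated error of a classifier trained only on the selected features. Because the search tracks the best mask seen so far, the returned fitness is never worse than that of the starting mask.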
Acknowledgements
The research reported in this publication was supported by the Deanship of Scientific Research at Al-Balqa Applied University in Jordan.
Ethics declarations
Conflict of interest
All authors state that there is no conflict of interest.
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Alweshah, M., Alkhalaileh, S., Albashish, D. et al. A hybrid mine blast algorithm for feature selection problems. Soft Comput 25, 517–534 (2021). https://doi.org/10.1007/s00500-020-05164-4
DOI: https://doi.org/10.1007/s00500-020-05164-4