Abstract
This paper proposes a multi-criteria feature selection framework that integrates the wrapper method with the Pareto Optimal (PO) method to identify cancer-related genes in microarray datasets. Sequential forward selection performs feature selection on each cross-validated training set, and PO serves as the aggregation method that combines the wrapper-based gene selection results from those training sets. The proposed gene selection requires no user intervention, and PO retains each valuable gene when constructing the most representative gene subset. To test the performance of the framework, an experimental study was conducted on three publicly available cancer microarray datasets. The results show that the framework gives robust aggregation and that accuracy improves when feature selection results are combined with PO. The findings also demonstrate that the Pareto Optimality based framework is robust to variations in the training sets and less prone to overfitting.
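As a rough illustration of the aggregation step described above, the following sketch keeps the genes that are Pareto-optimal across cross-validation folds. The abstract does not state which criteria the framework uses, so the two criteria here (fraction of folds in which a gene was selected, and mean accuracy gain when it was added) are assumptions. The fold data, gene names, and function names are likewise hypothetical, and the wrapper step itself is not shown.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is at least as good as b on every criterion and strictly
    better on at least one (all criteria are maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return, in sorted order, the genes not dominated by any other gene."""
    return sorted(
        gene for gene, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != gene)
    )


def aggregate_folds(
    fold_selections: List[List[Tuple[str, float]]]
) -> Dict[str, Tuple[float, float]]:
    """Summarise per-fold wrapper output as two assumed criteria per gene:
    (fraction of folds selecting the gene, mean accuracy gain when added)."""
    n_folds = len(fold_selections)
    gains: Dict[str, List[float]] = {}
    for selection in fold_selections:
        for gene, gain in selection:
            gains.setdefault(gene, []).append(gain)
    return {g: (len(v) / n_folds, sum(v) / len(v)) for g, v in gains.items()}


# Hypothetical sequential-forward-selection results on three training folds:
# each entry is (gene, accuracy gain observed when the gene was added).
folds = [
    [("G1", 0.20), ("G2", 0.05)],
    [("G1", 0.10), ("G3", 0.30)],
    [("G1", 0.15), ("G2", 0.05), ("G4", 0.02)],
]

criteria = aggregate_folds(folds)
selected = pareto_front(criteria)
print(selected)
```

In this toy input, G1 is chosen in every fold and G3 has the largest gain, so neither dominates the other and both are kept. G2 and G4 are dominated by G1 on both criteria and are dropped. No threshold or subset size is supplied by the user, which matches the abstract's claim that no user intervention is needed.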
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ogutcen, O.F., Belatreche, A., Seker, H. (2019). A Multi-objective Pareto-Optimal Wrapper Based Framework for Cancer-Related Gene Selection. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 869. Springer, Cham. https://doi.org/10.1007/978-3-030-01057-7_28
DOI: https://doi.org/10.1007/978-3-030-01057-7_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01056-0
Online ISBN: 978-3-030-01057-7
eBook Packages: Intelligent Technologies and Robotics (R0)