Abstract
Selecting relevant genes is a crucial task for sample classification in microarray data, where researchers try to identify the smallest possible set of genes that still achieves good predictive performance. Wrapper methods carry a higher risk of overfitting, and the best embedded methods are sensitive to the filter-out factor, which leads to unstable models and significantly different gene subsets. To address these problems, this paper proposes a novel model for evaluating and improving techniques for selecting informative genes from microarray data. The model is inspired by membrane computing and uses the kernel P system (kP), a variant of the P system, to improve the performance of an intelligent algorithm, multi-objective binary particle swarm optimization (MObPSO). The proposed model consists of two main parts. The first, kP-MObPSO, resembles a wrapper-type feature selection method. The second improves the results of the first through an embedded feature selection and classification approach based on the kP system. Division, rewriting, and input/output rules are used to create interaction among the genes inside and between the particles. The proposed model is applied to colorectal and breast cancer datasets containing 100 genes with six attributes. The marker gene sets extracted by the embedded part of the model show more stability and reliability under the ROC measure, as well as a lower error rate, than those from the wrapper part. The lowest error rates reported for the embedded model are 0.1111 for breast cancer and 0.0769 for colorectal data.
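To make the wrapper idea concrete, the following is a minimal sketch of binary particle swarm optimization used for gene selection. It is not the kP-MObPSO model of the paper: it contains no membrane (division, rewriting, or input/output) rules, and it collapses the two objectives into a single weighted cost for brevity. The synthetic data, the nearest-centroid classifier, the weight `alpha`, and all function names are assumptions chosen for illustration only.

```python
import math
import random

def make_data(n_samples=30, n_genes=12, informative=(0, 1, 2), seed=0):
    """Synthetic two-class expression matrix; only `informative` genes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label  # shift informative genes for class 1
        X.append(row)
        y.append(label)
    return X, y

def error_rate(X, y, mask):
    """Leave-one-out error of a nearest-centroid classifier on the selected genes."""
    genes = [j for j, bit in enumerate(mask) if bit]
    if not genes:
        return 1.0  # an empty gene subset is treated as the worst case
    errors = 0
    for i in range(len(X)):
        dist = {}
        for c in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == c]
            centroid = [sum(r[j] for r in rows) / len(rows) for j in genes]
            dist[c] = sum((X[i][j] - m) ** 2 for j, m in zip(genes, centroid))
        pred = 0 if dist[0] <= dist[1] else 1
        errors += int(pred != y[i])
    return errors / len(X)

def fitness(X, y, mask, alpha=0.9):
    """Assumed scalarised cost: weighted error plus fraction of genes kept (lower is better)."""
    return alpha * error_rate(X, y, mask) + (1 - alpha) * sum(mask) / len(mask)

def binary_pso(X, y, n_particles=8, n_iter=15, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Binary PSO: each particle is a 0/1 mask over genes; sigmoid(velocity) sets P(bit = 1)."""
    rng = random.Random(seed)
    n = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [fitness(X, y, p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n):
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, vel[i][j]))  # clamp to keep the sigmoid from saturating
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            cost = fitness(X, y, pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return gbest, gbest_cost

if __name__ == "__main__":
    X, y = make_data()
    mask, cost = binary_pso(X, y)
    print("selected genes:", [j for j, bit in enumerate(mask) if bit], "cost:", cost)
```

Because the classifier is retrained and scored for every candidate mask, the sketch shows why wrapper-style selection is expensive and prone to overfitting on small sample sizes, which is the motivation the abstract gives for adding an embedded stage.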
Acknowledgements
This research was supported by the grant for Development of Membrane Computing Software (Universiti Kebangsaan Malaysia (UKM), UKM Grant Code: GGP-2019-023), which played a vital role in its successful completion.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Muniyandi, R.C., Elkhani, N. (2021). P System as a Computing Tool for Embedded Feature Selection and Classification Method for Microarray Cancer Data. In: Freund, R., Ishdorj, TO., Rozenberg, G., Salomaa, A., Zandron, C. (eds) Membrane Computing. CMC 2020. Lecture Notes in Computer Science, vol 12687. Springer, Cham. https://doi.org/10.1007/978-3-030-77102-7_6
DOI: https://doi.org/10.1007/978-3-030-77102-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77101-0
Online ISBN: 978-3-030-77102-7
eBook Packages: Computer Science, Computer Science (R0)