Abstract
Cancer data obtained through microarray technology is difficult to classify because the sample size is very small and the dimensionality is very high. In addition, the classes in multiclass datasets are usually highly imbalanced. To reduce the dimensionality and thereby enable accurate classification, we propose an L1-regulated (L1-regularized) feature selection method followed by deep learning for classification. The L1-regulated feature selection is based on a Linear Support Vector Machine (LSVM) and adds a penalty term to the prediction error, which shrinks the weights of irrelevant features so that only the relevant features retain nonzero weights. For classification, a deep neural network is initialized with the sigmoid activation function in the input and hidden layers and, to accommodate multiclass classification, the softmax activation function in the output layer. To demonstrate the suitability of the proposed approach, experiments are conducted on six standard multiclass cancer datasets. To assess its predictive capability under class imbalance, experiments are also conducted on imbalanced datasets such as a 5-class lung cancer dataset and a 4-class leukemia dataset. A comparative study against state-of-the-art approaches is provided, and results are reported in terms of classification accuracy, precision, recall, F-measure, confusion matrix, average precision, and ROC metrics.
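The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` in place of a real microarray dataset, and the function names `build_pipeline` and `run_sketch`, the penalty strength `C`, and the hidden-layer sizes are all hypothetical choices made for illustration. Note also that scikit-learn's `MLPClassifier` applies the sigmoid (`"logistic"`) activation to hidden layers only and selects softmax for the output layer automatically when there are more than two classes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def build_pipeline(C=1.0, hidden=(64, 32), seed=0):
    """L1-penalized linear SVM feature selection followed by a sigmoid MLP.

    C and hidden are illustrative defaults, not values from the paper.
    """
    # The L1 penalty drives the weights of many features to exactly zero;
    # SelectFromModel keeps the features whose weights remain nonzero.
    selector = SelectFromModel(
        LinearSVC(penalty="l1", dual=False, C=C, max_iter=5000, random_state=seed)
    )
    # "logistic" is scikit-learn's name for the sigmoid activation.
    net = MLPClassifier(
        hidden_layer_sizes=hidden,
        activation="logistic",
        max_iter=1000,
        random_state=seed,
    )
    return Pipeline([("scale", StandardScaler()), ("select", selector), ("net", net)])


def run_sketch(seed=0):
    # Synthetic stand-in for microarray data: few samples, many features.
    X, y = make_classification(
        n_samples=120,
        n_features=500,
        n_informative=10,
        n_redundant=0,
        n_classes=4,
        n_clusters_per_class=1,
        random_state=seed,
    )
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    pipe = build_pipeline(seed=seed).fit(X_tr, y_tr)
    return {
        "n_selected": int(pipe.named_steps["select"].get_support().sum()),
        "out_activation": pipe.named_steps["net"].out_activation_,
        "proba": pipe.predict_proba(X_te),
        "accuracy": float(pipe.score(X_te, y_te)),
    }


if __name__ == "__main__":
    result = run_sketch()
    print("features kept:", result["n_selected"], "of 500")
    print("output activation:", result["out_activation"])
    print("held-out accuracy:", result["accuracy"])
```

Placing the selector inside the pipeline means it is fitted on the training split only, which avoids leaking test samples into the feature selection step. This matters when the sample size is as small as it is for microarray data.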
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shekar, B.H., Dagnew, G. (2020). L1-Regulated Feature Selection and Classification of Microarray Cancer Data Using Deep Learning. In: Chaudhuri, B., Nakagawa, M., Khanna, P., Kumar, S. (eds) Proceedings of 3rd International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 1024. Springer, Singapore. https://doi.org/10.1007/978-981-32-9291-8_19
DOI: https://doi.org/10.1007/978-981-32-9291-8_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9290-1
Online ISBN: 978-981-32-9291-8
eBook Packages: Engineering, Engineering (R0)