Abstract
Raw gene microarray data are high-dimensional and highly redundant, which hampers the classification and diagnosis of cancer. It is therefore important to reduce the dimensionality and to identify the genes that contribute most to cancer classification. This paper proposes a dimensionality reduction method that combines mutual information with principal component analysis (PCA). A support vector machine (SVM) serves as the classifier in the experiments used to evaluate the method. The experimental results indicate that the proposed method is effective for dimensionality reduction: it yields a very small feature subset and leads to better classification performance.
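The abstract does not spell out how the two stages are combined, so the following is only a minimal sketch under stated assumptions: genes are first ranked by a histogram-based mutual information estimate against the class label, the top-ranked genes are kept, and PCA is then applied to that subset. The ordering of the stages, the equal-width binning, the function names (`mutual_information`, `mi_then_pca`), the parameter values and the synthetic data are all illustrative choices, not the authors' implementation. The SVM stage is left out; any SVM implementation could be trained on the returned projection.

```python
import numpy as np

def mutual_information(feature, labels, n_bins=3):
    """Estimate I(feature; labels) in nats after equal-width binning (assumed estimator)."""
    feature = np.asarray(feature, dtype=float).ravel()
    labels = np.asarray(labels).ravel()
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    binned = np.digitize(feature, edges[1:-1])          # bin index in 0..n_bins-1
    classes, y = np.unique(labels, return_inverse=True)
    joint = np.zeros((n_bins, classes.size))
    np.add.at(joint, (binned, y), 1.0)                  # contingency counts
    joint /= joint.sum()                                # joint probabilities
    px = joint.sum(axis=1, keepdims=True)               # marginal over bins
    py = joint.sum(axis=0, keepdims=True)               # marginal over classes
    nz = joint > 0                                      # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def mi_then_pca(X, y, n_genes=50, n_components=5):
    """Keep the n_genes with the highest MI score, then project onto principal components."""
    X = np.asarray(X, dtype=float)
    scores = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    keep = np.argsort(scores)[::-1][:n_genes]           # indices of top-ranked genes
    Xs = X[:, keep]
    mean = Xs.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xs - mean, full_matrices=False)
    components = Vt[:n_components]
    Z = (Xs - mean) @ components.T                      # low-dimensional representation
    return Z, keep, mean, components

# Synthetic stand-in for a microarray matrix: 60 samples, 200 genes, 2 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
y = np.repeat([0, 1], 30)
X[y == 1, :5] += 3.0                                    # make the first 5 genes informative

Z, keep, mean, components = mi_then_pca(X, y, n_genes=10, n_components=2)
print(Z.shape)  # (60, 2)
```

In an actual evaluation, the gene ranking, the mean and the components would be estimated on the training samples only and then applied to held-out samples, so that the reported classification accuracy is not inflated by information from the test set.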
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 61425002, 61402066, 61402067, 31370778, 61370005, 31170797), the Basic Research Program of the Key Lab in Liaoning Province Educational Department (Nos. LZ2014049, LZ2015004), the Project Supported by Natural Science Foundation of Liaoning Province (No. 2014020132), the Project Supported by Scientific Research Fund of Liaoning Provincial Education (No. L2014499), and by the Program for Liaoning Key Lab of Intelligent Information Processing and Network Technology in University.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Han, L., Zhou, C., Wang, B., Zhang, Q. (2015). A Combining Dimensionality Reduction Approach for Cancer Classification. In: Bikakis, A., Zheng, X. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2015. Lecture Notes in Computer Science, vol 9426. Springer, Cham. https://doi.org/10.1007/978-3-319-26181-2_32
DOI: https://doi.org/10.1007/978-3-319-26181-2_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26180-5
Online ISBN: 978-3-319-26181-2
eBook Packages: Computer Science, Computer Science (R0)