Abstract
This paper proposes a multi-population χ² test method for selecting informative genes of a tumor from microarray data, based on the statistical multi-population χ² test with the sample data grouped evenly. To test the effectiveness of the method, we use a support vector machine (SVM) to construct a tumor diagnosis system (i.e., a binary classifier) from the identified informative genes on the colon and leukemia datasets. The experiments show that the diagnosis system built with the multi-population χ² test method achieves a 100% correct diagnosis rate on the colon dataset and a 97.1% correct diagnosis rate on the leukemia dataset.
This work was supported by the Natural Science Foundation of China under Project 60471054.
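As a rough illustration of the gene-scoring idea summarized in the abstract, the sketch below scores each gene with a Pearson χ² homogeneity statistic and ranks genes by that score. The abstract does not give the exact statistic or grouping rule, so several details here are assumptions: "grouped evenly" is read as equal-frequency binning of each gene's expression values, the function names (`equal_frequency_bins`, `chi2_homogeneity`, `rank_genes`) are hypothetical, and the toy data are invented for the example. It should not be taken as the authors' implementation.

```python
from typing import List, Sequence


def equal_frequency_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign each sample a bin index so bins hold (nearly) equal sample counts.

    Assumption: this is one possible reading of "grouped evenly".
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def chi2_homogeneity(bins: Sequence[int], labels: Sequence[int], n_bins: int) -> float:
    """Pearson chi-square statistic for a (classes x bins) contingency table."""
    classes = sorted(set(labels))
    table = {c: [0] * n_bins for c in classes}
    for b, c in zip(bins, labels):
        table[c][b] += 1
    n = len(labels)
    col_totals = [sum(table[c][j] for c in classes) for j in range(n_bins)]
    stat = 0.0
    for c in classes:
        row_total = sum(table[c])
        for j in range(n_bins):
            expected = row_total * col_totals[j] / n
            if expected > 0:  # skip empty cells to avoid division by zero
                stat += (table[c][j] - expected) ** 2 / expected
    return stat


def rank_genes(expression: Sequence[Sequence[float]],
               labels: Sequence[int],
               n_bins: int = 3) -> List[int]:
    """Return gene indices sorted by decreasing chi-square score."""
    scores = [
        chi2_homogeneity(equal_frequency_bins(gene, n_bins), labels, n_bins)
        for gene in expression
    ]
    return sorted(range(len(expression)), key=lambda g: scores[g], reverse=True)


# Toy data (hypothetical): rows are genes, columns are samples.
labels = [0, 0, 0, 1, 1, 1]
expression = [
    [5.1, 4.9, 5.0, 5.2, 4.8, 5.0],  # similar in both classes
    [1.0, 1.2, 0.9, 7.8, 8.1, 8.0],  # separates the two classes
]
ranking = rank_genes(expression, labels, n_bins=2)
print(ranking)  # [1, 0]
```

In a full pipeline, the expression profiles of the top-ranked genes would then serve as input features for a binary classifier such as an SVM; that step is omitted here to keep the sketch free of third-party dependencies.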
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Luo, J., Ma, J. (2005). A Multi-population χ² Test Approach to Informative Gene Selection. In: Gallagher, M., Hogan, J.P., Maire, F. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2005. IDEAL 2005. Lecture Notes in Computer Science, vol 3578. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11508069_53
DOI: https://doi.org/10.1007/11508069_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26972-4
Online ISBN: 978-3-540-31693-0
eBook Packages: Computer Science, Computer Science (R0)