Virtual Gene: Using Correlations Between Genes to Select Informative Genes on Microarray Datasets

Xu, Xian; Zhang, Aidong

doi:10.1007/11567752_10

Conference paper

Part of the book series: Lecture Notes in Computer Science (TCSB, volume 3680)

Abstract

Gene selection is one of the most widely used classes of data analysis algorithms for microarray datasets. The goal of a gene selection algorithm is to identify a small set of informative genes that best explains the experimental variation. Traditional gene selection algorithms are mostly single-gene based: a discriminative score is calculated for each gene, the genes are sorted by score, and the top-ranked genes are selected as informative genes for further study. Such algorithms completely ignore correlations between genes, although such correlations are widely known to exist: genes interact with each other through various pathways and regulatory networks. In this paper, we propose to use such correlations for gene selection rather than ignore them. Experiments performed on three publicly available datasets show promising results.
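The single-gene ranking scheme described in the abstract can be illustrated with a minimal sketch. It is not the method proposed in the paper: the signal-to-noise style score, the function names (`signal_to_noise`, `rank_genes`, `pearson`) and the toy expression matrix are all assumptions made here for illustration. The sketch scores each gene independently, keeps the top-ranked genes, and then measures the correlation between the selected genes, which the ranking step never looks at.

```python
from statistics import mean, stdev

def signal_to_noise(values, labels):
    """Assumed single-gene score: |mean_1 - mean_0| / (sd_1 + sd_0)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = stdev(pos) + stdev(neg)
    return abs(mean(pos) - mean(neg)) / spread if spread > 0 else 0.0

def rank_genes(expr, labels, k):
    """Score every gene (row of expr) on its own and return the top-k indices."""
    scores = [signal_to_noise(row, labels) for row in expr]
    order = sorted(range(len(expr)), key=lambda g: scores[g], reverse=True)
    return order[:k]

def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

# Hypothetical toy data: 4 genes (rows) x 6 samples (columns).
expr = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # gene 0: separates the two classes
    [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # gene 1: uninformative
    [1.1, 1.3, 1.0, 3.1, 3.2, 3.0],  # gene 2: shifted copy of gene 0
    [5.0, 4.0, 6.0, 5.5, 4.5, 6.5],  # gene 3: noisy, weak class signal
]
labels = [0, 0, 0, 1, 1, 1]

top = rank_genes(expr, labels, 2)
# Correlation between the two selected genes; ranking never used it.
redundancy = pearson(expr[top[0]], expr[top[1]])
```

On this toy input the two selected genes are genes 0 and 2, which are almost perfectly correlated, so the second pick adds little new information. This is the kind of blind spot that motivates taking correlations between genes into account.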

This research is partly supported by National Science Foundation Grants DBI-0234895, IIS-0308001 and National Institutes of Health Grant 1 P20 GM067650-01A1. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the National Science Foundation or the National Institutes of Health.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, 14260, USA
Xian Xu & Aidong Zhang

Editor information

Editors and Affiliations

Centre for Computational and Systems Biology, The Microsoft Research - University of Trento, Piazza Manci, 17, 38050, Povo (TN), Italy
Corrado Priami
Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
Alexander Zelikovsky

Cite this paper

Xu, X., Zhang, A. (2005). Virtual Gene: Using Correlations Between Genes to Select Informative Genes on Microarray Datasets. In: Priami, C., Zelikovsky, A. (eds) Transactions on Computational Systems Biology II. Lecture Notes in Computer Science, vol 3680. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11567752_10

DOI: https://doi.org/10.1007/11567752_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29401-6
Online ISBN: 978-3-540-31661-9
eBook Packages: Computer Science, Computer Science (R0)
