Virtual Gene: Using Correlations Between Genes to Select Informative Genes on Microarray Datasets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (TCSB, volume 3680)

Abstract

Gene selection is one of the most widely used classes of data analysis algorithms for microarray datasets. The goal of a gene selection algorithm is to extract a small set of informative genes that best explains the experimental variation. Traditional gene selection algorithms are mostly single-gene based: a discriminative score is calculated for each gene, the genes are sorted by score, and the top-ranked genes are selected as informative genes for further study. Such algorithms completely ignore correlations between genes, although such correlations are widely known to exist: genes interact with each other through various pathways and regulatory networks. In this paper, we propose to use such correlations for gene selection rather than ignore them. Experiments performed on three publicly available datasets show promising results.
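
The single-gene ranking described in the abstract, and the reason it can miss correlated genes, can be illustrated with a minimal sketch. The paper's own scoring function and its definition of a "virtual gene" are not given on this page, so everything below is an assumption made for illustration: the score is a signal-to-noise ratio of the kind used in early microarray classification work, the gene names and expression values are toy data, and `pair_difference` is a hypothetical helper that stands in for any way of combining two correlated genes.

```python
from statistics import mean, stdev

def signal_to_noise(values, labels):
    """Single-feature score: (mean1 - mean0) / (sd1 + sd0). Illustrative choice."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    denom = stdev(pos) + stdev(neg)
    return (mean(pos) - mean(neg)) / denom if denom > 0 else 0.0

def rank_genes(expr, labels, top_k):
    """Score each gene on its own, sort by absolute score, keep the top_k names."""
    scores = {g: abs(signal_to_noise(v, labels)) for g, v in expr.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def pair_difference(expr, a, b):
    """Hypothetical combined feature: sample-wise difference of two genes."""
    return [x - y for x, y in zip(expr[a], expr[b])]

# Toy data (made up): gA and gB share a large sample-specific baseline,
# so each looks uninformative alone, but gA carries a small class offset.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "gA": [1.0, 5.1, 9.0, 2.1, 5.9, 10.0],
    "gB": [1.1, 4.9, 9.0, 0.9, 5.1, 9.0],
    "gC": [2.0, 3.0, 2.5, 3.5, 4.0, 3.0],
}

top_single = rank_genes(expr, labels, top_k=1)
pair_score = abs(signal_to_noise(pair_difference(expr, "gA", "gB"), labels))
best_single_score = abs(signal_to_noise(expr[top_single[0]], labels))
print(top_single, round(best_single_score, 2), round(pair_score, 2))
```

In this toy example the single-gene ranking prefers gC, while the difference of the two correlated genes gA and gB separates the classes more cleanly than any individual gene. That is the kind of signal a purely single-gene method cannot see; how the paper itself exploits such correlations is not specified here.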

This research is partly supported by National Science Foundation Grants DBI-0234895, IIS-0308001 and National Institutes of Health Grant 1 P20 GM067650-01A1. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the National Science Foundation or the National Institutes of Health.




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, X., Zhang, A. (2005). Virtual Gene: Using Correlations Between Genes to Select Informative Genes on Microarray Datasets. In: Priami, C., Zelikovsky, A. (eds) Transactions on Computational Systems Biology II. Lecture Notes in Computer Science, vol 3680. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11567752_10


  • DOI: https://doi.org/10.1007/11567752_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29401-6

  • Online ISBN: 978-3-540-31661-9

  • eBook Packages: Computer Science, Computer Science (R0)
