Gene-Pair Representation and Incorporation of GO-based Semantic Similarity into Classification of Gene Expression Data

  • Conference paper
Rough Sets and Current Trends in Computing (RSCTC 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6086)

Abstract

To emphasize gene interactions in classification algorithms, a new representation is proposed that comprises gene pairs rather than single genes. Each pair is represented by the L1 difference between the corresponding expression values. The novel representation is evaluated on benchmark datasets and is shown to often increase classification accuracy for genetic datasets. Exploiting the gene-pair representation and the Gene Ontology (GO), the semantic similarity of gene pairs can be incorporated to pre-select pairs with a high similarity value. The GO-based feature selection approach is compared to plain data-driven selection and is shown to often increase classification accuracy.
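As a rough illustration of the two ideas in the abstract, the following Python sketch builds pair features for a single sample and then filters the pairs by a precomputed similarity score. It assumes that the L1 difference of two scalar expression values is their absolute difference. The function names, the toy gene names, the similarity scores, and the threshold are all hypothetical; the paper's actual GO-based similarity measure and selection procedure are not reproduced here.

```python
from itertools import combinations


def gene_pair_features(expression, genes):
    """Map one sample's expression values to gene-pair features.

    expression: dict mapping gene name -> expression value for one sample.
    genes: ordered list of gene names to pair up.
    Returns a dict mapping (gene_a, gene_b) -> |x_a - x_b|
    (assumed reading of the "L1 difference" in the abstract).
    """
    return {
        (a, b): abs(expression[a] - expression[b])
        for a, b in combinations(genes, 2)
    }


def preselect_pairs(pairs, similarity, threshold):
    """Keep pairs whose precomputed similarity score is >= threshold.

    similarity: dict mapping pair -> hypothetical GO-based similarity score;
    pairs without a score are treated as having similarity 0.
    """
    return [p for p in pairs if similarity.get(p, 0.0) >= threshold]


# Toy data (made up for illustration only).
sample = {"g1": 2.0, "g2": 5.5, "g3": 1.0}
features = gene_pair_features(sample, ["g1", "g2", "g3"])

go_similarity = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.7}
selected = preselect_pairs(list(features), go_similarity, threshold=0.5)
```

With n genes this representation yields n(n-1)/2 pair features, which is one reason a pre-selection step such as the similarity filter sketched above becomes useful before classification.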

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schön, T., Tsymbal, A., Huber, M. (2010). Gene-Pair Representation and Incorporation of GO-based Semantic Similarity into Classification of Gene Expression Data. In: Szczuka, M., Kryszkiewicz, M., Ramanna, S., Jensen, R., Hu, Q. (eds) Rough Sets and Current Trends in Computing. RSCTC 2010. Lecture Notes in Computer Science, vol 6086. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13529-3_24

  • DOI: https://doi.org/10.1007/978-3-642-13529-3_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13528-6

  • Online ISBN: 978-3-642-13529-3

  • eBook Packages: Computer Science, Computer Science (R0)
