Integrating Multiple-Platform Expression Data through Gene Set Features

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5542)

Abstract

We demonstrate a set-level approach to integrating multiple-platform gene expression data for predictive classification and show its utility for boosting classification performance when single-platform samples are rare. We explore three ways of defining gene sets, including a novel one based on the notion of a fully coupled flux related to metabolic pathways. In two tissue classification tasks, we show empirically that the gene-set-based approach is useful for combining heterogeneous expression data. Surprisingly, in experiments constrained to a single platform, biologically meaningful gene sets acting as sample features are often outperformed by random gene sets with no biological relevance.
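
As a rough illustration of the set-level idea described above, the sketch below maps samples from two platforms, which measure partly different genes, onto one shared gene-set feature space. The abstract does not say how the paper computes set-level features, so everything here is an assumption made for illustration: the aggregation rule (mean of per-platform z-scored expression over the member genes that a platform measures), the gene and set names, the toy values, and the function names.

```python
from statistics import mean, pstdev

def zscore_per_gene(samples):
    """Standardize each gene across the samples of ONE platform (assumed step)."""
    genes = set().union(*(s.keys() for s in samples))
    stats = {}
    for g in genes:
        vals = [s[g] for s in samples if g in s]
        sd = pstdev(vals)
        # Guard against constant genes to avoid division by zero.
        stats[g] = (mean(vals), sd if sd > 0 else 1.0)
    return [{g: (v - stats[g][0]) / stats[g][1] for g, v in s.items()}
            for s in samples]

def gene_set_features(sample, gene_sets):
    """Average the expression of each set's member genes present in the sample."""
    feats = {}
    for name, members in gene_sets.items():
        present = [sample[g] for g in members if g in sample]
        # Hypothetical fallback: 0.0 when the platform measures no member gene.
        feats[name] = mean(present) if present else 0.0
    return feats

def integrate(platforms, gene_sets):
    """Pool samples from several platforms into one gene-set feature table."""
    rows = []
    for samples in platforms:
        for s in zscore_per_gene(samples):
            rows.append(gene_set_features(s, gene_sets))
    return rows

# Toy data: the two platforms cover different genes but share the gene sets.
gene_sets = {"set_A": ["g1", "g2"], "set_B": ["g3", "g4"]}
platform_1 = [{"g1": 1.0, "g2": 2.0, "g3": 5.0},
              {"g1": 3.0, "g2": 4.0, "g3": 1.0}]
platform_2 = [{"g2": 10.0, "g3": 20.0, "g4": 30.0},
              {"g2": 30.0, "g3": 10.0, "g4": 10.0}]

rows = integrate([platform_1, platform_2], gene_sets)
for r in rows:
    print(r)
```

Every row has the same keys regardless of the originating platform, so the pooled rows could be passed to any standard classifier. Standardizing within each platform before aggregating is one simple way to reduce platform-specific scale differences; it is a design choice of this sketch, not a claim about the paper's preprocessing.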

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Holec, M., Železný, F., Kléma, J., Tolar, J. (2009). Integrating Multiple-Platform Expression Data through Gene Set Features. In: Măndoiu, I., Narasimhan, G., Zhang, Y. (eds) Bioinformatics Research and Applications. ISBRA 2009. Lecture Notes in Computer Science, vol 5542. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01551-9_2

  • DOI: https://doi.org/10.1007/978-3-642-01551-9_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01550-2

  • Online ISBN: 978-3-642-01551-9

  • eBook Packages: Computer Science, Computer Science (R0)
