Abstract
Recent advances in omics technologies have generated huge amounts of data. To fully exploit these data sets, many of which are publicly available, robust computational methodologies need to be developed for the storage, integration, analysis, visualization, and dissemination of these data. In this paper, we describe some of our research activities in data integration leading to novel knowledge discovery in the life sciences. Our multi-strategy approach, which integrates prior knowledge, provides a novel means of identifying informative genes that could be missed by commonly used methods. Our transcriptomics-proteomics integrative framework serves to enhance confidence in, and to complement, transcriptomics discoveries. Our new research direction in integrative analysis of omics data aims to identify molecular associations with disease and therapeutic response signatures. The ultimate goal of this research is to facilitate the development of clinical test kits for early detection, accurate diagnosis and prognosis of disease, and better personalized therapeutic management.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Famili, F., Phan, S., Fauteux, F., Liu, Z., Pan, Y. (2010). Data Integration and Knowledge Discovery in Life Sciences. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6098. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13033-5_11
DOI: https://doi.org/10.1007/978-3-642-13033-5_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13032-8
Online ISBN: 978-3-642-13033-5
eBook Packages: Computer Science (R0)