Abstract
The biological interpretation of large-scale gene expression data is one of the challenges in current bioinformatics. The state-of-the-art approach is to cluster the expression profiles and then characterize each cluster functionally by computing its enrichment in Gene Ontology (GO) terms [1]. To better assist the interpretation of results, it may be useful to establish connections among different clusters. This machine learning step is sometimes termed cluster meta-analysis, and several approaches have already been proposed; they usually rely on enrichments based on flat lists of GO terms. However, GO terms are organized in taxonomical graphs, whose structure should be taken into account when performing enrichment studies. To tackle this problem, we propose a kernel approach that can exploit this graph structure. Finally, we compare our approach against a specific flat-list method by analyzing the cdc15 subset of the well-known yeast cell cycle dataset of Spellman et al. [2].
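To make the contrast between flat-list and graph-aware comparisons concrete, the following minimal Python sketch compares clusters annotated with terms from a toy is-a hierarchy. It is an illustration only and not the kernel proposed in this chapter: the term names, the `PARENTS` table, and the helper functions are all hypothetical, and the graph-aware measure is a simple set-intersection kernel over ancestor closures, chosen because it is easy to verify by hand.

```python
import math

# Toy is-a hierarchy: term -> set of parent terms.
# Hypothetical labels for illustration, not real GO identifiers.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "cell_cycle": {"root"},
    "mitosis": {"cell_cycle"},
    "dna_replication": {"cell_cycle"},
    "glycolysis": {"metabolism"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def closure(terms):
    """Union of the ancestor sets of every term annotating a cluster."""
    result = set()
    for term in terms:
        result |= ancestors(term)
    return result


def flat_kernel(a, b):
    """Flat-list similarity: number of terms shared verbatim."""
    return len(set(a) & set(b))


def graph_kernel(a, b):
    """Graph-aware similarity: number of shared terms after adding ancestors.

    This is an inner product of indicator vectors, hence positive semidefinite.
    """
    return len(closure(a) & closure(b))


def normalized(kernel, a, b):
    """Cosine-style normalization so that k(a, a) == 1."""
    return kernel(a, b) / math.sqrt(kernel(a, a) * kernel(b, b))


cluster_1 = {"mitosis"}
cluster_2 = {"dna_replication"}
cluster_3 = {"glycolysis"}

print(flat_kernel(cluster_1, cluster_2))               # no shared terms
print(normalized(graph_kernel, cluster_1, cluster_2))  # siblings under cell_cycle
print(normalized(graph_kernel, cluster_1, cluster_3))  # related only via the root
```

In this toy setting the flat comparison sees no relation between the two cell-cycle clusters, whereas the ancestor-based measure ranks them as closer to each other than to the metabolism cluster.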
This work has been supported by EC “Marie Curie” grant MIRG-CT-2005-031140.
References
Gene Ontology Consortium: The Gene Ontology (GO) project in 2006. Nucleic Acids Research 34(Database issue), D322–D326 (2006)
Spellman, P.T., et al.: Comprehensive Identification of Cell Cycle-Regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization. Molecular Biology of the Cell 9, 3273–3297 (1998)
Li, X., Quigg, R.J.: An Integrated Strategy for the Optimization of Microarray Data Interpretation. Gene Expression 12(4-6), 223–230 (2005)
Khatri, P., Draghici, S.: Ontological analysis of gene expression data: current tools, limitations and open problems. Bioinformatics 21 (2005)
Doherty, J.M., Carmichael, L.K., Mills, J.C.: GOurmet: a tool for Quantitative Comparison and Visualization of Gene Expression Profiles Based on Gene Ontology (GO) Distributions. BMC Bioinformatics 7(151) (2006)
Bar-Joseph, Z.: Analyzing time series gene expression data. Bioinformatics 20(16), 2493–2503 (2004)
Ernst, J., Bar-Joseph, Z.: STEM: a tool for the analysis of short time series expression data. BMC Bioinformatics 7(191) (2006)
Antoniotti, M., et al.: Remembrance of Experiments Past: Analyzing Time Course Datasets to Discover Complex Temporal Invariants. Technical Report CIMS TR2005-858, Bioinformatics Group, Courant Institute of Mathematical Sciences, New York University (February 2005)
Ramakrishnan, N., Antoniotti, M., Mishra, B.: Reconstructing Formal Temporal Models of Cellular Events using the GO Process Ontology. In: Bio-Ontologies SIG Meeting, ISMB, Detroit MI, U.S.A. (2005)
Kleinberg, S., et al.: Remembrance of Experiments Past: A Redescription Based Tool for Discovery in Complex Systems. In: Proceedings of the International Conference on Complex Systems, Boston, MA, U.S.A. (June 2006)
Antoniotti, M.: GOALIE site (2004-2007), http://bioinformatics.nyu.edu/Projects/GOALIE/
Ben-Hur, A., Noble, W.S.: Kernel methods for predicting protein-protein interactions. Bioinformatics 21 (2005)
Schölkopf, B., Tsuda, K., Vert, J.P.: Kernel Methods in Computational Biology. MIT Press, Cambridge (2004)
Haffner, P., Mohri, M., Cortes, C.: Positive Definite Rational Kernels. In: Schölkopf, B., Warmuth, M.K. (eds.) COLT/Kernel 2003. LNCS (LNAI), vol. 2777, pp. 41–56. Springer, Heidelberg (2003)
Gärtner, T., Flach, P., Wrobel, S.: On Graph Kernels: Hardness Results and Efficient Alternatives. In: Schölkopf, B., Warmuth, M.K. (eds.) COLT/Kernel 2003. LNCS (LNAI), vol. 2777, pp. 129–143. Springer, Heidelberg (2003)
Kashima, H., Tsuda, K., Inokuchi, A.: Marginalized Kernels between Labelled Graphs. In: Proceedings of ICML (2003)
Kondor, R.S., Lafferty, J.: Diffusion Kernels on Graphs and Other Discrete Structures. In: Proceedings of ICML (2002)
Borgwardt, K.M., Ong, C.S., Schönauer, S., et al.: Protein Function Prediction via Graph Kernels. Bioinformatics 21 (2005)
Joslyn, C.A., et al.: The Gene Ontology Categorizer. Bioinformatics 20 (2004)
Lord, P.W., et al.: Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation. Bioinformatics 19(10) (2003)
Schölkopf, B., Smola, A.J.: Learning with Kernels. MIT Press, Cambridge (2002)
Loganantharaj, R., Cheepala, S., Clifford, J.: Metric for Measuring the Effectiveness of Clustering of DNA Microarray Expression. BMC Bioinformatics 7 (2006)
Al-Shahrour, F., Diaz-Uriarte, R., Dopazo, J.: FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20, 578–580 (2004)
Beißbarth, T., Speed, T.P.: GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 20(9), 1464–1465 (2004)
Robinson, P.N., et al.: Ontologizing gene-expression microarray data: characterizing clusters with Gene Ontology. Bioinformatics 20(6), 979–981 (2004)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zoppis, I., Merico, D., Antoniotti, M., Mishra, B., Mauri, G. (2007). Discovering Relations Among GO-Annotated Clusters by Graph Kernel Methods. In: Măndoiu, I., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2007. Lecture Notes in Computer Science, vol 4463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72031-7_15
DOI: https://doi.org/10.1007/978-3-540-72031-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72030-0
Online ISBN: 978-3-540-72031-7
eBook Packages: Computer Science (R0)