Abstract
Comparative analysis of biomolecular networks constructed using measurements from different conditions, tissues, and organisms offers a powerful approach to understanding the structure, function, dynamics, and evolution of complex biological systems. We explore a class of algorithms for aligning large biomolecular networks by breaking down such networks into subgraphs and computing the alignment of the networks based on the alignment of their subgraphs. The resulting subnetworks are compared using graph kernels as scoring functions. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit. Our experiments using Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus, and Homo sapiens protein-protein interaction networks extracted from the DIP repository of protein-protein interaction data demonstrate that the performance of the proposed algorithms (as measured by percent GO term enrichment of subnetworks identified by the alignment) is competitive with some of the state-of-the-art algorithms for pairwise alignment of large protein-protein interaction networks. Our results also show that the inter-species similarity scores computed based on graph kernels can be used to cluster the species into a species tree that is consistent with the known phylogenetic relationships among the species.
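The idea of scoring pairs of subgraphs with a graph kernel can be illustrated with a minimal sketch. It is not the BiNA implementation: the decomposition into k-hop neighborhoods, the choice of an unlabeled shortest-path kernel, and all function names below are assumptions made for illustration only.

```python
from collections import Counter, deque
from math import sqrt


def bfs_distances(adj, source, max_depth=None):
    """Hop distances from source; adj maps each node to a set of neighbors."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if max_depth is not None and dist[u] == max_depth:
            continue  # do not expand beyond the requested radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def k_hop_subgraph(adj, center, k):
    """Subgraph induced by all nodes within k hops of center (assumed decomposition)."""
    nodes = set(bfs_distances(adj, center, max_depth=k))
    return {u: adj[u] & nodes for u in nodes}


def shortest_path_histogram(adj):
    """Count unordered, connected node pairs by shortest-path length."""
    hist = Counter()
    for s in adj:
        for d in bfs_distances(adj, s).values():
            if d > 0:
                hist[d] += 1
    # Every unordered pair was seen once from each endpoint.
    return Counter({d: c // 2 for d, c in hist.items()})


def sp_kernel(adj_a, adj_b):
    """Unlabeled shortest-path kernel: number of path pairs with equal length."""
    ha = shortest_path_histogram(adj_a)
    hb = shortest_path_histogram(adj_b)
    return sum(ha[d] * hb[d] for d in ha)


def normalized_sp_kernel(adj_a, adj_b):
    """Cosine-normalized kernel in [0, 1]; 0.0 if either graph has no edges."""
    kaa, kbb = sp_kernel(adj_a, adj_a), sp_kernel(adj_b, adj_b)
    if kaa == 0 or kbb == 0:
        return 0.0
    return sp_kernel(adj_a, adj_b) / sqrt(kaa * kbb)


# Toy networks (hypothetical data): a 3-node path and a triangle.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}
score = normalized_sp_kernel(k_hop_subgraph(path, "b", 1), triangle)
```

In a sketch like this, an alignment procedure would compute such a score for candidate pairs of subgraphs, one from each network, and retain the highest-scoring pairs. A kernel that also matches node labels (for example, sequence homology between proteins) would be a natural refinement, but that goes beyond what the abstract specifies.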
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Towfic, F., Greenlee, M.H.W., Honavar, V. (2009). Aligning Biomolecular Networks Using Modular Graph Kernels. In: Salzberg, S.L., Warnow, T. (eds) Algorithms in Bioinformatics. WABI 2009. Lecture Notes in Computer Science, vol 5724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04241-6_29
DOI: https://doi.org/10.1007/978-3-642-04241-6_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04240-9
Online ISBN: 978-3-642-04241-6
eBook Packages: Computer Science (R0)