Abstract
Probabilistic functional integrated networks are powerful tools with which to draw inferences from high-throughput data. However, network analyses are generally not tailored to specific biological functions or processes. This problem may be overcome by extracting process-specific sub-networks, but this approach discards useful information and is of limited use in poorly annotated areas of the network. Here we describe an extension to existing integration methods which exploits dataset biases in order to emphasise interactions relevant to specific processes, without loss of data. We apply the method to high-throughput data for the yeast Saccharomyces cerevisiae, using Gene Ontology annotations for ageing and telomere maintenance as test processes. The resulting networks perform significantly better than unbiased networks for assigning function to unknown genes, and for clustering to identify important sets of interactions. We conclude that this integration method can be used to enhance network analysis with respect to specific processes of biological interest.
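The abstract does not state the integration formula, so the following is only a minimal sketch of how a relevance-biased integration of this kind might look. It assumes a decayed weighted sum of per-dataset confidence scores, in which the order of the terms is set by each dataset's relevance to the process of interest rather than by its confidence. The function name `integrate`, the parameters `rank_by` and `d`, and the dataset names and scores are all hypothetical and chosen for illustration. Every edge from every dataset is retained; only its weight changes.

```python
from collections import defaultdict


def integrate(datasets, confidence, rank_by, d=2.0):
    """Combine interaction datasets into one weighted network.

    datasets   : dict mapping dataset name -> iterable of (gene_a, gene_b) pairs
    confidence : dict mapping dataset name -> confidence score for that dataset
    rank_by    : dict mapping dataset name -> score used to order the datasets
                 (relevance to a process for a biased network, or the
                 confidence scores themselves for an unbiased one)
    d          : decay factor (assumed parameter); larger values down-weight
                 lower-ranked evidence more strongly

    Returns a dict mapping frozenset({gene_a, gene_b}) -> integrated weight.
    """
    evidence = defaultdict(list)
    for name, pairs in datasets.items():
        for gene_a, gene_b in pairs:
            edge = frozenset((gene_a, gene_b))
            evidence[edge].append((rank_by[name], confidence[name]))

    network = {}
    for edge, scores in evidence.items():
        # Highest-ranked dataset contributes its full score; each later
        # dataset is divided by a further factor of d.
        scores.sort(key=lambda s: s[0], reverse=True)
        network[edge] = sum(conf / d ** rank
                            for rank, (_, conf) in enumerate(scores))
    return network


# Hypothetical toy data: two datasets, one of them highly relevant to the
# process of interest but of lower overall confidence.
datasets = {
    "y2h": [("A", "B")],
    "tap": [("A", "B"), ("B", "C")],
}
confidence = {"y2h": 1.0, "tap": 3.0}
relevance = {"y2h": 0.9, "tap": 0.1}

biased = integrate(datasets, confidence, rank_by=relevance)
unbiased = integrate(datasets, confidence, rank_by=confidence)

print(biased[frozenset(("A", "B"))])    # 2.5
print(unbiased[frozenset(("A", "B"))])  # 3.5
```

In this toy case both networks contain the same edges, which reflects the "without loss of data" property described above; only the weight of the edge supported by both datasets differs between the two orderings.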
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
James, K., Wipat, A., Hallinan, J. (2009). Integration of Full-Coverage Probabilistic Functional Networks with Relevance to Specific Biological Processes. In: Paton, N.W., Missier, P., Hedeler, C. (eds) Data Integration in the Life Sciences. DILS 2009. Lecture Notes in Computer Science(), vol 5647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02879-3_4
DOI: https://doi.org/10.1007/978-3-642-02879-3_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02878-6
Online ISBN: 978-3-642-02879-3
eBook Packages: Computer Science, Computer Science (R0)