Reconstructing the Topology of Protein Complexes

Bernard, Allister; Vaughn, David S.; Hartemink, Alexander J.

doi:10.1007/978-3-540-71681-5_3

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4453)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

Recent advances in high-throughput experimental techniques have enabled the production of a wealth of protein interaction data, rich in both quantity and variety. While the sheer quantity and variety of data present special difficulties for modeling, they also present unique opportunities for gaining insight into protein behavior by leveraging multiple perspectives. Recent work on the modularity of protein interactions has revealed that reasoning about protein interactions at the level of domain interactions can be quite useful. We present proctor, a learning algorithm for reconstructing the internal topology of protein complexes by reasoning at the domain level about both direct protein interaction data (Y2H) and protein co-complex data (AP-MS). While other methods have attempted to use data from both these kinds of assays, they usually require that co-complex data be transformed into pairwise interaction data under a spoke or clique model, a transformation we do not require. We apply proctor to data from eight high-throughput datasets, encompassing 5,925 proteins, essentially all of the yeast proteome. First we show that proctor outperforms other algorithms for predicting domain-domain and protein-protein interactions from Y2H and AP-MS data. Then we show that our algorithm can reconstruct the internal topology of AP-MS purifications, revealing known complexes like Arp2/3 and RNA polymerase II, as well as suggesting new complexes along with their corresponding topologies.
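
To make the spoke and clique models mentioned above concrete, the following is a minimal sketch of how a single AP-MS purification (one bait plus the preys it pulls down) would be expanded into pairwise interactions under each model. It illustrates the transformation the abstract says other methods rely on, not proctor itself, which does not require it. The function names and the protein labels A through D are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: pair the bait with each prey; preys are not paired with each other."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def clique_pairs(bait, preys):
    """Clique model: pair every member of the purification with every other member."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical purification: bait "A" pulls down preys "B", "C", and "D".
bait, preys = "A", ["B", "C", "D"]
spoke = spoke_pairs(bait, preys)
clique = clique_pairs(bait, preys)

print(len(spoke), "pairs under the spoke model")
print(len(clique), "pairs under the clique model")
```

For a purification with n members, the spoke model yields n - 1 pairs and the clique model yields n(n - 1)/2, so the two expansions assert quite different sets of pairwise interactions from the same observation.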

Author information

Authors and Affiliations

Department of Computer Science, Duke University, Durham, NC 27708-0129
Allister Bernard, David S. Vaughn & Alexander J. Hartemink

Editor information

Terry Speed, Haiyan Huang

Cite this paper

Bernard, A., Vaughn, D.S., Hartemink, A.J. (2007). Reconstructing the Topology of Protein Complexes. In: Speed, T., Huang, H. (eds) Research in Computational Molecular Biology. RECOMB 2007. Lecture Notes in Computer Science, vol 4453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71681-5_3

DOI: https://doi.org/10.1007/978-3-540-71681-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71680-8
Online ISBN: 978-3-540-71681-5
eBook Packages: Computer Science, Computer Science (R0)
