Learning Cellular Sorting Pathways Using Protein Interactions and Sequence Motifs

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2011)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6577)

Abstract

Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations before it reaches its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this paper we present a new method that integrates sequence, motif and protein interaction data to model how proteins are sorted through these targeting pathways. We use a hidden Markov model (HMM) to represent protein targeting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model also identified new pathways, together with their putative carriers and motifs, which may represent novel protein sorting mechanisms.
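The idea of scoring a protein against alternative sorting routes can be illustrated with a deliberately small sketch. This is not the HMM from the paper: it enumerates paths through a tiny hand-written Markov chain and weights each path by a Bernoulli likelihood of the observed features. Every state, transition probability, feature name, and constant below is a hypothetical value chosen only for illustration.

```python
# Toy sketch only: all states, probabilities, and feature names are invented
# for illustration and are not taken from the paper or its learned model.

# Transition probabilities between (hypothetical) sorting states.
TRANSITIONS = {
    "start": {"cytosol": 0.4, "nucleus": 0.3, "ER": 0.3},
    "ER": {"Golgi": 1.0},
    "Golgi": {"vacuole": 0.5, "plasma_membrane": 0.5},
}

# Feature (motif or carrier interaction) assumed to mark passage through a state.
STATE_FEATURE = {
    "nucleus": "NLS_motif",
    "ER": "signal_peptide",
    "vacuole": "vacuolar_carrier_interaction",
}

P_FEATURE_ON_PATH = 0.9      # chance of seeing a feature if its state is visited
P_FEATURE_BACKGROUND = 0.05  # chance of seeing it otherwise


def enumerate_paths(state="start", prob=1.0, path=()):
    """Yield (path, prior probability) for every route to a terminal state."""
    successors = TRANSITIONS.get(state)
    if not successors:  # no outgoing transitions: a final compartment
        yield path, prob
        return
    for nxt, p in successors.items():
        yield from enumerate_paths(nxt, prob * p, path + (nxt,))


def likelihood(path, observed):
    """Probability of the observed feature set given that the path was taken."""
    like = 1.0
    for state, feature in STATE_FEATURE.items():
        p = P_FEATURE_ON_PATH if state in path else P_FEATURE_BACKGROUND
        like *= p if feature in observed else 1.0 - p
    return like


def path_posterior(observed):
    """Normalized posterior probability of each path given the features."""
    scores = {
        path: prior * likelihood(path, observed)
        for path, prior in enumerate_paths()
    }
    total = sum(scores.values())
    return {path: score / total for path, score in scores.items()}


def predict(observed):
    """Return the most probable path and its posterior probability."""
    posterior = path_posterior(observed)
    best = max(posterior, key=posterior.get)
    return best, posterior[best]


if __name__ == "__main__":
    features = {"signal_peptide", "vacuolar_carrier_interaction"}
    print(predict(features))
```

With these made-up numbers, a protein carrying both the signal peptide and the vacuolar carrier interaction is assigned to the ER, Golgi, vacuole route, while a protein with no features defaults to the cytosol. Exhaustive enumeration is only workable for a chain this small; learning the states and parameters from data, as the abstract describes, is a separate problem that this sketch does not attempt.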

Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, Th., Bar-Joseph, Z., Murphy, R.F. (2011). Learning Cellular Sorting Pathways Using Protein Interactions and Sequence Motifs. In: Bafna, V., Sahinalp, S.C. (eds) Research in Computational Molecular Biology. RECOMB 2011. Lecture Notes in Computer Science, vol 6577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20036-6_20

  • DOI: https://doi.org/10.1007/978-3-642-20036-6_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20035-9

  • Online ISBN: 978-3-642-20036-6

  • eBook Packages: Computer Science (R0)
