DOI: 10.1145/2506583.2506634
tutorial

Improving phosphopeptide identification in shotgun proteomics by supervised filtering of peptide-spectrum matches

Published: 22 September 2013

ABSTRACT

One of the important objectives in mass spectrometry-based proteomics is the identification of post-translationally modified sites in cellular and extracellular proteomes. Proteomics techniques have been particularly effective in studying protein phosphorylation, where tens of thousands of new sites have recently been discovered in all domains of life. This massive discovery of new sites has been facilitated by progress in affinity enrichment techniques, by high-throughput analytical platforms that couple liquid chromatography (LC) with tandem mass spectrometry (MS/MS), and by powerful computational tools that assign peptides to tandem mass spectra. In this work we focus on computational protocols for identifying phosphoproteins, phosphopeptides, and phosphosites. Although current tools already provide solid results, most methods have not been tuned to exploit the particular sequence and physicochemical properties of phosphopeptides or the peculiarities of their fragment spectra. Novel algorithms can therefore be designed to increase the sensitivity of phosphosite identification. Here we describe a machine learning-based method that improves the identification of phosphopeptides in LC-MS/MS experiments. Our algorithm is applied as a post-processing step to a standard database search. It assigns a probability score to each peptide-spectrum match (PSM) corresponding to a phosphopeptide, based on the sequence and spectral features of the peptide and its assigned fragment spectra as well as the biological propensity of particular residues in the peptide to be phosphorylated. The algorithm is based on a simple but robust logistic regression model and is used together with a conventional search engine (here, MASCOT) to filter out the PSMs with the lowest probability of being correctly identified.
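The scoring step described above can be illustrated with a minimal sketch of a logistic regression score for a single PSM. This is not the authors' implementation: the feature names, weights, and bias below are hypothetical placeholders chosen only to show how sequence, spectral, and propensity features could be combined into one probability.

```python
import math

# Hypothetical features and weights, for illustration only. The abstract does
# not list the actual features or fitted coefficients of the model.
HYPOTHETICAL_WEIGHTS = {
    "search_engine_score": 0.08,    # e.g. a database search score for the PSM
    "fraction_matched_ions": 3.0,   # share of predicted fragment ions observed
    "neutral_loss_peak": 1.2,       # 1 if a phosphate neutral-loss peak is seen
    "site_propensity": 2.0,         # predicted propensity of the residue (0..1)
}
HYPOTHETICAL_BIAS = -6.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def psm_probability(features, weights=HYPOTHETICAL_WEIGHTS, bias=HYPOTHETICAL_BIAS):
    """Return the model probability that a PSM is a correct identification."""
    z = bias + sum(weights[name] * features[name] for name in weights)
    return sigmoid(z)


# Two made-up PSMs: one well supported, one poorly supported.
strong_psm = {"search_engine_score": 60, "fraction_matched_ions": 0.8,
              "neutral_loss_peak": 1, "site_propensity": 0.9}
weak_psm = {"search_engine_score": 15, "fraction_matched_ions": 0.2,
            "neutral_loss_peak": 0, "site_propensity": 0.1}

if __name__ == "__main__":
    print(psm_probability(strong_psm), psm_probability(weak_psm))
```

In practice the weights would be fitted on labeled PSMs instead of being set by hand; the sketch only shows the form of the score.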
Our protocol was tested on two large phosphoproteomics data sets, on which it increased the number of identified phosphopeptides by 10-15% compared to conventional scoring algorithms at the same false discovery rate threshold of 1%.
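To make the idea of filtering PSMs at a fixed false discovery rate concrete, the following sketch accepts the largest set of top-scoring PSMs whose estimated FDR stays at or below a threshold. It assumes a target-decoy strategy, with FDR estimated as decoys divided by targets; the abstract does not state how the FDR was estimated, so this is an assumption, and the function name `filter_at_fdr` and the toy data are hypothetical.

```python
def filter_at_fdr(psms, max_fdr=0.01):
    """Return target PSMs accepted at the given FDR threshold.

    psms: iterable of (score, is_decoy) pairs, where a higher score means a
    more confident match. FDR at each cutoff is estimated as
    (number of decoys) / (number of targets) among the PSMs above the cutoff.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best_cut = 0  # number of top-ranked PSMs to accept
    for i, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest cutoff that still satisfies the threshold.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    return [p for p in ranked[:best_cut] if not p[1]]


# Toy data: 150 confident targets, 3 decoys, then 50 weaker targets.
toy_psms = [(0.9, False)] * 150 + [(0.5, True)] * 3 + [(0.4, False)] * 50

if __name__ == "__main__":
    print(len(filter_at_fdr(toy_psms, max_fdr=0.01)))
```

Under this scheme, a scoring function that ranks correct matches above decoys more consistently lets more target PSMs pass at the same threshold, which is the sense in which a better score can raise the number of identifications at 1% FDR.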

      • Published in

        BCB'13: Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
        September 2013
        987 pages
        ISBN:9781450324342
        DOI:10.1145/2506583

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • tutorial
        • Research
        • Refereed limited

        Acceptance Rates

        BCB'13 Paper Acceptance Rate: 43 of 148 submissions, 29%
        Overall Acceptance Rate: 254 of 885 submissions, 29%
