Abstract
The protein inference problem represents a major challenge in shotgun proteomics. Here we describe a novel Bayesian approach to this challenge that incorporates predicted peptide detectabilities as the prior probabilities of peptide identification. Our model removes some unrealistic assumptions used in previous approaches and provides a rigorous probabilistic solution to the problem. We tested our method on a complex synthetic protein mixture and obtained promising results.
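As a rough illustration of what it means to use a predicted detectability as a prior, the sketch below applies Bayes' rule to a single peptide. It is not the model described in the paper: the function name `peptide_posterior`, the use of a score likelihood ratio as the evidence term, and the restriction to one peptide whose parent protein is assumed present are all assumptions made for this example.

```python
def peptide_posterior(detectability: float, likelihood_ratio: float) -> float:
    """Posterior probability that a peptide identification is correct.

    Hypothetical illustration only (not the paper's model).
    detectability    -- predicted detectability in [0, 1], used as the prior
    likelihood_ratio -- P(score | correct) / P(score | incorrect), >= 0
    """
    if not 0.0 <= detectability <= 1.0:
        raise ValueError("detectability must lie in [0, 1]")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")

    # Bayes' rule in odds form: prior weight times evidence, then normalise.
    numerator = detectability * likelihood_ratio
    denominator = numerator + (1.0 - detectability)
    if denominator == 0.0:
        # Prior of 1 combined with evidence of 0 is degenerate; report 0.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Same spectrum evidence, two different priors.
    print(peptide_posterior(0.2, 4.0))
    print(peptide_posterior(0.8, 4.0))
```

With the same evidence (a likelihood ratio of 4), a peptide with detectability 0.2 ends up at a posterior of 0.5, while one with detectability 0.8 ends up at about 0.94, which is the sense in which the prior shapes the final identification probability in this simplified setting.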
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Y.F., Arnold, R.J., Li, Y., Radivojac, P., Sheng, Q., Tang, H. (2008). A Bayesian Approach to Protein Inference Problem in Shotgun Proteomics. In: Vingron, M., Wong, L. (eds) Research in Computational Molecular Biology. RECOMB 2008. Lecture Notes in Computer Science, vol 4955. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78839-3_15
DOI: https://doi.org/10.1007/978-3-540-78839-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78838-6
Online ISBN: 978-3-540-78839-3
eBook Packages: Computer Science