Abstract
Mass spectrometry (MS) instruments and experimental protocols are rapidly advancing, but de novo peptide sequencing algorithms for analyzing tandem mass (MS/MS) spectra are lagging behind. While existing de novo sequencing tools perform well on certain types of spectra (e.g., Collision Induced Dissociation (CID) spectra of tryptic peptides), their performance often deteriorates on other types, such as Electron Transfer Dissociation (ETD) spectra, Higher-energy Collisional Dissociation (HCD) spectra, or spectra of non-tryptic digests. Thus, rather than developing a new algorithm for each type of spectrum, we develop a universal de novo sequencing algorithm called UniNovo that works well for all types of spectra and even for spectral pairs (e.g., CID/ETD spectral pairs). We compare the performance of UniNovo with that of PepNovo+, PEAKS, and pNovo on various types of spectra. The results show that UniNovo outperforms the other tools on ETD spectra and is superior or comparable to them on CID and HCD spectra. UniNovo also estimates the probability that each reported reconstruction is correct, using simple statistics that are readily obtained from a small training dataset. We demonstrate that this estimate is accurate for all tested types of spectra (including CID, HCD, ETD, CID/ETD, and HCD/ETD spectra of trypsin, LysC, or AspN digested peptides). The appendix is available online at http://proteomics.ucsd.edu/Software/UniNovo.html.
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jeong, K., Kim, S., Pevzner, P.A. (2013). UniNovo: A Universal Tool for de Novo Peptide Sequencing. In: Deng, M., Jiang, R., Sun, F., Zhang, X. (eds) Research in Computational Molecular Biology. RECOMB 2013. Lecture Notes in Computer Science, vol 7821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37195-0_9
DOI: https://doi.org/10.1007/978-3-642-37195-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37194-3
Online ISBN: 978-3-642-37195-0
eBook Packages: Computer Science, Computer Science (R0)