Abstract
Similarity search against theoretical mass spectra generated from protein sequence databases is a widely accepted approach for identifying peptides from query mass spectra produced by shotgun proteomics. Because query spectra contain many inaccuracies and databases have grown rapidly in recent years, more accurate mass spectra similarity measures and database indexing techniques are still in demand. We present a statistical comparison of the parameterized Hausdorff distance with the freely available tools OMSSA and X!Tandem and with the cosine similarity. We show that a precursor mass filter combined with a modification of the previously proposed parameterized Hausdorff distance outperforms state-of-the-art tools in both the speed of search and the number of identified peptide sequences (even at a q-value of only 0.001). Our method is implemented in the freely available application SimTandem, which can be used in the TOPP framework based on OpenMS.
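As a rough illustration of two ingredients named above, the following Python sketch shows a precursor mass filter over a mass-sorted candidate list and a cosine similarity between binned spectra. It is not the SimTandem implementation and does not reproduce the parameterized Hausdorff distance. The function names, the `(m/z, intensity)` peak representation, the bin width, and the tolerance values are all assumptions made for this example.

```python
import math
from bisect import bisect_left, bisect_right


def precursor_mass_filter(sorted_masses, query_mass, tolerance):
    """Return the index range [lo, hi) of candidates whose mass lies
    within +/- tolerance of the query precursor mass.

    Assumes sorted_masses is sorted in ascending order (hypothetical layout).
    """
    lo = bisect_left(sorted_masses, query_mass - tolerance)
    hi = bisect_right(sorted_masses, query_mass + tolerance)
    return lo, hi


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (mz, intensity) peaks into fixed-width bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned intensity vectors."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum is treated as dissimilar to everything
    return dot / (norm_a * norm_b)
```

In a search of this shape, the filter would first narrow the candidate peptides to those near the query precursor mass, and only the surviving candidates would be scored with the (more expensive) spectrum similarity.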
This work was supported in part by a FEBS Short-Term Fellowship, by the Czech Science Foundation (GAČR) project No. 202/11/0968, and by the grant SVV-2013-267312.
References
Beck, M., et al.: The quantitative proteome of a human cell line. Molecular Systems Biology 7, 549 (2011)
Craig, R., Beavis, R.C.: TANDEM: matching proteins with tandem mass spectra. Bioinformatics 20(9), 1466–1467 (2004)
Eidhammer, I., Flikka, K., Martens, L., Mikalsen, S.O.: Computational Methods for Mass Spectrometry Proteomics. John Wiley & Sons, England (2007)
Eng, J., McCormack, A., Yates, J.: An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. of the Am. Soc. for Mass Spec. 5, 976–989 (1994)
Geer, L.Y., et al.: Open Mass Spectrometry Search Algorithm. Journal of Proteome Research 3, 958–964 (2004)
Käll, L., et al.: Assigning Significance to Peptides Identified by Tandem Mass Spectrometry Using Decoy Databases. Journal of Proteome Research 7, 29–34 (2008)
Kohlbacher, O., et al.: TOPP – the OpenMS proteomics pipeline. Bioinformatics 23(2), e191–e197 (2007)
Liu, J., et al.: Methods for peptide identification by spectral comparison. Proteome Science 5(3) (2007)
NCBI RefSeq, http://www.ncbi.nlm.nih.gov/RefSeq/
Nesvizhskii, A.I.: A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. Journal of Proteomics 73(11), 2092–2123 (2010)
Novák, J., Hoksza, D.: Parametrised Hausdorff Distance as a Non-Metric Similarity Model for Tandem Mass Spectrometry. In: CEUR Proc. DATESO, pp. 1–12 (2010)
Perkins, D.N., et al.: Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20(18), 3551–3567 (1999)
Pevzner, P.A., et al.: Efficiency of Database Search for Identification of Mutated and Modified Proteins via Mass Spectrometry. Genome Research 11(2), 290–299 (2001)
Sturm, M., et al.: OpenMS – An open-source software framework for mass spectrometry. BMC Bioinformatics 9, 163 (2008)
UniProtKB/Swiss-Prot, http://www.uniprot.org/
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Novák, J., Sachsenberg, T., Hoksza, D., Skopal, T., Kohlbacher, O. (2013). A Statistical Comparison of SimTandem with State-of-the-Art Peptide Identification Tools. In: Mohamad, M., Nanni, L., Rocha, M., Fdez-Riverola, F. (eds) 7th International Conference on Practical Applications of Computational Biology & Bioinformatics. Advances in Intelligent Systems and Computing, vol 222. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00578-2_14
DOI: https://doi.org/10.1007/978-3-319-00578-2_14
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00577-5
Online ISBN: 978-3-319-00578-2
eBook Packages: Engineering, Engineering (R0)