Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 222)

Abstract

Similarity search in theoretical mass spectra generated from protein sequence databases is a widely accepted approach for identifying peptides from query mass spectra produced by shotgun proteomics. Because query spectra contain many inaccuracies and databases have grown rapidly in recent years, more accurate mass spectra similarity measures and database indexing techniques are still needed. We present a statistical comparison of the parameterized Hausdorff distance with the freely available tools OMSSA and X!Tandem and with the cosine similarity. We show that a precursor mass filter combined with a modification of the previously proposed parameterized Hausdorff distance outperforms state-of-the-art tools in both search speed and the number of identified peptide sequences, even at a q-value of only 0.001. Our method is implemented in the freely available application SimTandem, which can be used within the TOPP framework based on OpenMS.
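The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not SimTandem's implementation: the abstract does not give the exact form of the modified parameterized Hausdorff distance, so the n-th-root averaged variant below is an assumption chosen for illustration, and all function names, the bin width, and the tolerance are hypothetical.

```python
import math
from bisect import bisect_left, bisect_right


def precursor_filter(candidates, query_mass, tolerance):
    """Keep candidates whose precursor mass lies within +/- tolerance.

    `candidates` is a list of (precursor_mass, payload) tuples that is
    assumed to be sorted by precursor_mass, so binary search applies.
    """
    masses = [mass for mass, _ in candidates]
    lo = bisect_left(masses, query_mass - tolerance)
    hi = bisect_right(masses, query_mass + tolerance)
    return candidates[lo:hi]


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity of two spectra given as (m/z, intensity) lists.

    Peaks are summed into fixed-width m/z bins (bin_width is an assumption).
    """
    def binned(peaks):
        vec = {}
        for mz, intensity in peaks:
            key = int(mz // bin_width)
            vec[key] = vec.get(key, 0.0) + intensity
        return vec

    a, b = binned(peaks_a), binned(peaks_b)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def directed_root_distance(mz_a, mz_b, n=2.0):
    """Average over peaks in mz_a of the n-th root of the distance to the
    nearest peak in mz_b. Assumed form, not taken from the abstract."""
    if not mz_a or not mz_b:
        raise ValueError("both peak lists must be non-empty")
    total = sum(min(abs(x - y) for y in mz_b) ** (1.0 / n) for x in mz_a)
    return total / len(mz_a)


def root_hausdorff(mz_a, mz_b, n=2.0):
    """Symmetric distance: the larger of the two directed distances."""
    return max(directed_root_distance(mz_a, mz_b, n),
               directed_root_distance(mz_b, mz_a, n))
```

In a search built from these pieces, the precursor filter would first narrow the database to candidates of similar mass, and only those survivors would be scored against the query spectrum with one of the two similarity functions.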

This work was supported in part by FEBS Short-Term Fellowship, by Czech Science Foundation (GAČR) project Nr. 202/11/0968 and by the grant SVV-2013-267312.

Corresponding author

Correspondence to Jiří Novák.

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Novák, J., Sachsenberg, T., Hoksza, D., Skopal, T., Kohlbacher, O. (2013). A Statistical Comparison of SimTandem with State-of-the-Art Peptide Identification Tools. In: Mohamad, M., Nanni, L., Rocha, M., Fdez-Riverola, F. (eds) 7th International Conference on Practical Applications of Computational Biology & Bioinformatics. Advances in Intelligent Systems and Computing, vol 222. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00578-2_14

  • DOI: https://doi.org/10.1007/978-3-319-00578-2_14

  • Publisher Name: Springer, Heidelberg

  • Print ISBN: 978-3-319-00577-5

  • Online ISBN: 978-3-319-00578-2

  • eBook Packages: Engineering, Engineering (R0)
