Revealing the Proteome Complexity by Mass Spectrometry

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Abstract

The complexity of higher biological organisms is astounding, but the source of this complexity is far from obvious. With the emergence of epigenetics, the assumed main source of complexity has shifted from the genome to pre- and post-translational modifications of proteins. There are an estimated 100,000 different protein sequences in the human organism, and perhaps 10 to 100 times as many different protein forms. Analysis of the human proteome is therefore a much more challenging task than that of the human genome. The challenge is to provide a sufficient amount of information in experimental datasets to match the underlying complexity.

Mass spectrometry (MS) is one of the most informative techniques and is widely used today for protein characterization. MS is the fastest-growing area of spectroscopy, and in 2005 it overtook NMR as the prime research field. After a major revolution in the late 1980s (recognized by the Nobel Prize in Chemistry in 2002), MS has continued to develop rapidly, showing an amazing capacity for innovation. Today, several different types of mass analyzers compete with each other for the future. This diversity means that the field of MS, although a century old, is still in a fast-evolving phase and far from saturation.

Despite the rapid progress, today’s MS tools are still largely insufficient. Mathematical models of MS-based proteomics analysis, as well as experimental assessments, have shown large disproportions between the information content of experimental MS datasets and the underlying sample complexity. One of the most desired improvements is higher-quality ion fragmentation in tandem mass spectrometry (MS/MS). Fragmentation quality boils down to the ability to specifically fragment each of the chemical bonds (C-C, C-N and N-C) linking amino acid residues in a polypeptide sequence. This formidable physico-chemical challenge is being met by recently emerged techniques involving ion-electron reactions.
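To make the role of the backbone bonds concrete, the sketch below computes singly charged fragment ion m/z values for a short peptide. It is an illustration only and is not taken from the paper: the function name `fragment_ions` and the example peptide are hypothetical, the residue table is a small subset of standard monoisotopic masses, and the b/y (peptide bond) and c/z (N-C bond, the type commonly associated with ion-electron reactions) nomenclature is the conventional one rather than anything the abstract specifies.

```python
# Standard monoisotopic residue masses in Da (subset, for illustration).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "F": 147.06841, "R": 156.10111,
}
PROTON = 1.007276     # mass of the charge-carrying proton
WATER = 18.010565     # H2O
AMMONIA = 17.026549   # NH3
HYDROGEN = 1.007825   # H atom


def fragment_ions(peptide):
    """Return singly charged b, y, c and z-radical ion m/z values.

    Entry k of every list corresponds to cleavage between residue k+1
    and residue k+2 (counting from the N-terminus).
    """
    masses = [RESIDUE[aa] for aa in peptide]
    ions = {"b": [], "y": [], "c": [], "z": []}
    for i in range(1, len(masses)):
        n_part = sum(masses[:i])   # residues kept by the N-terminal fragment
        c_part = sum(masses[i:])   # residues kept by the C-terminal fragment
        b = n_part + PROTON
        y = c_part + WATER + PROTON
        ions["b"].append(b)
        ions["y"].append(y)
        ions["c"].append(b + AMMONIA)             # c = b + NH3
        ions["z"].append(y - AMMONIA + HYDROGEN)  # z-radical = y - NH3 + H
    return ions


if __name__ == "__main__":
    for series, values in fragment_ions("GASP").items():
        print(series, [round(v, 4) for v in values])
```

A complete set of such fragments, one per inter-residue position, is what allows every residue in the sequence to be localized; a missing cleavage leaves the order of the two neighbouring residues undetermined.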

Characterization of primary polypeptide sequences of unmodified amino acids is a basic task in proteomics. A recent large-scale evaluation has shown that de novo sequencing by conventional MS/MS is insufficiently reliable. Fortunately, novel fragmentation techniques have improved the situation and allowed the first proteomics-grade de novo sequencing routine to be developed.
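The core idea of de novo sequencing, reading residues from the mass differences between consecutive fragment ions, can be sketched as follows. This is a simplified illustration under stated assumptions, not the routine referred to in the abstract: it assumes a complete, noise-free ladder of singly charged b ions, and the function name `read_ladder`, the tolerance values, and the example ladder are hypothetical.

```python
# Standard monoisotopic residue masses in Da (subset, for illustration).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "I": 113.08406,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "F": 147.06841,
}
PROTON = 1.007276


def read_ladder(b_ions, tol=0.01):
    """Read a residue sequence from an ascending list of b-ion m/z values.

    Each step is the mass difference between neighbouring peaks; the first
    step is measured from the bare proton. A step that matches several
    residues within `tol` (in Da) is reported as "X/Y"; no match gives "?".
    """
    points = [PROTON] + list(b_ions)
    sequence = []
    for low, high in zip(points, points[1:]):
        delta = high - low
        matches = sorted(aa for aa, m in RESIDUE.items() if abs(m - delta) <= tol)
        sequence.append("/".join(matches) if matches else "?")
    return sequence


if __name__ == "__main__":
    ladder = [58.028736, 129.065846, 216.097876, 329.181936]
    print(read_ladder(ladder))
```

Two limitations show up even in this toy version. Leucine and isoleucine have identical masses and cannot be separated by mass difference alone, and lysine and glutamine differ by only about 0.036 Da, so they merge once the tolerance is looser than that. A single missing peak also turns two residues into one unexplained gap, which is one reason incomplete fragmentation makes de novo results unreliable.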

Another group of challenges relates to the ability to extract maximum information from MS/MS data. The database search technologies developed in the late 1990s are still the backbone of routine proteomics analyses, but they are rapidly becoming insufficient. Typically, only 5 to 15% of all MS/MS data produce “hits” in the database, and the bulk of the data is discarded. Research on this issue has led to the emergence of a quality factor for MS/MS data (S-score). S-score analysis has shown that only half of these data are discarded for a good reason, while the other half could be utilized by improved algorithms. Such algorithms, specially designed to deal with any mutation or modification, have recently uncovered hundreds of new types of modifications in the human proteome. High mass accuracy reveals the elemental compositions of these modifications, and MS/MS determines their positions. The potential of such algorithms for unearthing the vast and previously invisible world of modifications, and thus for tackling the proteome’s enormous complexity, will be discussed.
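How mass accuracy constrains the elemental composition of a modification can be illustrated with a brute-force search over small formulas. This is a minimal sketch, not the algorithm used by the author: the function name `compositions`, the element limits, and the tolerances are assumptions chosen for the example, only additions (non-negative atom counts) are considered, and no chemical plausibility filter is applied.

```python
from itertools import product

# Standard monoisotopic atomic masses in Da.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "P": 30.973762, "S": 31.972071}
# Hypothetical upper bounds on atom counts, chosen only for this example.
MAX_COUNT = {"C": 6, "H": 12, "N": 3, "O": 4, "P": 1, "S": 1}


def compositions(delta, tol):
    """List (formula, mass) pairs whose mass lies within `tol` Da of `delta`."""
    elements = list(ATOM)
    hits = []
    for counts in product(*(range(MAX_COUNT[e] + 1) for e in elements)):
        mass = sum(n * ATOM[e] for e, n in zip(elements, counts))
        if mass > 0 and abs(mass - delta) <= tol:
            formula = "".join(
                e + (str(n) if n > 1 else "")
                for e, n in zip(elements, counts) if n
            )
            hits.append((formula, mass))
    # Best match first.
    return sorted(hits, key=lambda hit: abs(hit[1] - delta))


if __name__ == "__main__":
    shift = 42.010565  # observed mass shift in Da (acetylation, C2H2O)
    for tol in (0.05, 0.001):
        print(tol, [formula for formula, _ in compositions(shift, tol)])
```

With a loose tolerance the shift of about 42.01 Da is compatible with several formulas, including C3H6 (trimethylation, about 42.047 Da); at a tolerance of 0.001 Da only C2H2O remains within the limits assumed here. The accurate mass thus narrows down what the modification is, while the fragment ions that carry the shift indicate where it sits.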

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zubarev, R.A. (2006). Revealing the Proteome Complexity by Mass Spectrometry. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_38

  • DOI: https://doi.org/10.1007/11732990_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33295-4

  • Online ISBN: 978-3-540-33296-1

  • eBook Packages: Computer Science, Computer Science (R0)
