Revealing the Proteome Complexity by Mass Spectrometry

Zubarev, Roman A.

doi:10.1007/11732990_38

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

The complexity of higher biological organisms is astounding, but the source of this complexity is far from obvious. With the emergence of epigenetics, the assumed main source of complexity has shifted from the genome to pre- and post-translational modifications of proteins. There are an estimated 100,000 different protein sequences in the human organism, and perhaps 10 to 100 times as many different protein forms. Analysis of the human proteome is therefore a much more challenging task than analysis of the human genome. The challenge is to provide enough information in experimental datasets to match the underlying complexity.

Mass spectrometry (MS) is one of the most informative techniques and is widely used today for protein characterization. MS is the fastest-growing area of spectroscopy, and in 2005 it overtook NMR as the prime research field. After a major revolution in the late 1980s (recognized by the Nobel Prize in Chemistry in 2002), MS has continued to develop rapidly, showing a remarkable capacity for innovation. Today, several different types of mass analyzers compete with each other for the future. This diversity means that the field of MS, although a century old, is still evolving fast and is far from saturation.

Despite the rapid progress, today’s MS tools are still largely insufficient. Both mathematical models of MS-based proteomics analysis and experimental assessments have shown large disproportions between the information content of experimental MS datasets and the underlying sample complexity. One of the most desired improvements is higher-quality ion fragmentation in tandem mass spectrometry (MS/MS). This quality boils down to the ability to specifically fragment each of the chemical bonds (C-C, C-N and N-C) linking amino acid residues in a polypeptide sequence. This formidable physico-chemical challenge is being met by recently emerged techniques involving ion-electron reactions.
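
To make the idea of bond-specific fragmentation concrete, the following is a minimal sketch, not the author's method. It assumes the three bond types are read as the backbone C(alpha)-C, C-N (peptide) and N-C(alpha) bonds, and it uses the standard fragment-ion nomenclature in which b/y ions come from cleavage of the peptide bond and c/z ions from cleavage of the N-C(alpha) bond. The function name `fragment_ions`, the example peptide, and the choice of singly charged ions are illustrative assumptions; the masses are standard monoisotopic values.

```python
# Sketch: singly charged fragment ion m/z values for an unmodified peptide.
# All names here are illustrative; nothing below is taken from the paper.

PROTON = 1.007276     # mass of a proton (Da)
WATER = 18.010565     # monoisotopic mass of H2O (Da)
AMMONIA = 17.026549   # monoisotopic mass of NH3 (Da)
HYDROGEN = 1.007825   # monoisotopic mass of a hydrogen atom (Da)

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def fragment_ions(peptide):
    """Return one dict per inter-residue site with b, y, c and z-radical m/z.

    Site i lies between residue i and residue i + 1 (1-based), so a peptide
    of n residues has n - 1 sites.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    ions = []
    prefix = 0.0
    for site in range(1, len(masses)):
        prefix += masses[site - 1]          # N-terminal part up to this site
        suffix = total - prefix             # C-terminal part after this site
        b = prefix + PROTON                 # peptide-bond (C-N) cleavage
        y = suffix + WATER + PROTON         # complementary C-terminal ion
        c = b + AMMONIA                     # N-C(alpha) cleavage, N-terminal
        z_dot = y - AMMONIA + HYDROGEN      # N-C(alpha) cleavage, C-terminal radical
        ions.append({"site": site, "b": b, "y": y, "c": c, "z_dot": z_dot})
    return ions


if __name__ == "__main__":
    for ion in fragment_ions("PEPTIDE"):
        print(ion["site"], *(round(ion[k], 4) for k in ("b", "y", "c", "z_dot")))
```

A spectrum in which every site contributes at least one observed ion would, in this simplified picture, determine the full sequence; missing sites are what make sequence assignment ambiguous.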

Characterization of primary polypeptide sequences of unmodified amino acids is a basic task in proteomics. A recent large-scale evaluation has shown that de novo sequencing by conventional MS/MS is insufficiently reliable. Fortunately, novel fragmentation techniques have improved the situation and allowed the first proteomics-grade de novo sequencing routine to be developed.

Another group of challenges relates to extracting the maximum information from MS/MS data. The database search technologies developed in the late 1990s are still the backbone of routine proteomics analyses, but they are rapidly becoming insufficient. Typically, only 5 to 15% of all MS/MS data produce “hits” in the database, and the bulk of the data is discarded. Research into this issue has led to the emergence of a quality factor for MS/MS data (S-score). S-score analysis has shown that only half of the data are discarded for a good reason, while the other half could be utilized by improved algorithms. Such algorithms, specially designed to deal with any mutation or modification, have recently uncovered hundreds of new types of modifications in the human proteome. High mass accuracy reveals the elemental compositions of these modifications, and MS/MS determines their positions. The potential of such algorithms for unearthing the vast and previously invisible world of modifications, and thus tackling the proteome’s enormous complexity, will be discussed.
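
The statement that high mass accuracy reveals elemental compositions can be illustrated with a small sketch. It is an assumption-laden toy, not the algorithm the abstract refers to: the candidate table `CANDIDATES`, the function names, and the example numbers are hypothetical, and a real search would consider far more compositions. The element masses are standard monoisotopic values.

```python
# Sketch: match an observed mass shift to candidate modification compositions
# within a relative mass tolerance given in parts per million (ppm).

# Monoisotopic masses (Da) of the elements used below.
ELEMENT_MASS = {
    "C": 12.0, "H": 1.00782503, "N": 14.0030740,
    "O": 15.9949146, "P": 30.9737620, "S": 31.9720707,
}

# Hypothetical mini-table: modification name -> elemental composition change.
CANDIDATES = {
    "methylation": {"C": 1, "H": 2},
    "oxidation": {"O": 1},
    "acetylation": {"C": 2, "H": 2, "O": 1},
    "trimethylation": {"C": 3, "H": 6},
    "phosphorylation": {"H": 1, "P": 1, "O": 3},
}


def composition_mass(composition):
    """Monoisotopic mass (Da) of an elemental composition such as {"C": 2, "H": 2}."""
    return sum(ELEMENT_MASS[element] * count for element, count in composition.items())


def match_shift(observed_shift, peptide_mass, ppm):
    """Return the names of candidates whose mass matches the observed shift.

    The absolute tolerance is derived from the peptide mass, because the
    measurement error scales with the mass of the ion being measured.
    """
    tolerance = peptide_mass * ppm * 1e-6
    return [
        name
        for name, composition in CANDIDATES.items()
        if abs(composition_mass(composition) - observed_shift) <= tolerance
    ]


if __name__ == "__main__":
    # Acetylation and trimethylation differ by only about 0.036 Da.
    print(match_shift(42.0106, 1500.0, ppm=5))
    print(match_shift(42.0106, 1500.0, ppm=50))
```

With these illustrative numbers, a 5 ppm tolerance on a 1500 Da peptide separates acetylation from trimethylation, whereas a 50 ppm tolerance leaves both as candidates. Locating the modified residue would still require the MS/MS fragment data.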

Author information

Authors and Affiliations

Laboratory for Biological and Medical Mass Spectrometry, Uppsala University, Sweden
Roman A. Zubarev

Editor information

Editors and Affiliations

Georgia Institute of Technology and Università di Padova,
Alberto Apostolico
Topic Chairs, P.O. Box
Concettina Guerra
Center for Molecular Biology and Computer Science Department, Brown University, 115 Waterman St., 02912, Providence, RI, USA
Sorin Istrail
University of California, San Diego, USA
Pavel A. Pevzner
Department of Molecular and Computational Biology, University of Southern California, 1050 Childs Way, 90089-2910, Los Angeles, CA, USA
Michael Waterman

About this paper

Cite this paper

Zubarev, R.A. (2006). Revealing the Proteome Complexity by Mass Spectrometry. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_38

DOI: https://doi.org/10.1007/11732990_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33295-4
Online ISBN: 978-3-540-33296-1
eBook Packages: Computer Science, Computer Science (R0)
