Article (DOI: 10.1145/1276958.1277382)

Peptide detectability following ESI mass spectrometry: prediction using genetic programming

Published: 07 July 2007

Abstract

The accurate quantification of proteins is important in several areas of cell biology, biotechnology and medicine. Both relative and absolute quantification of proteins are often performed following mass spectrometric analysis of one or more of their constituent peptides. However, in order for quantification to be successful, it is important that the experimenter knows which peptides are readily detectable under the mass spectrometric conditions used for analysis. In this paper, genetic programming is used to develop a function which predicts the detectability of peptides from their calculated physico-chemical properties. Classification is carried out in two stages: the selection of a good classifier using the AUROC objective function and the setting of an appropriate threshold. This allows the user to select the balance point between conflicting priorities in an intuitive way. The success of this method is found to be highly dependent on the initial selection of input parameters. The use of brood recombination and a modified version of the multi-objective FOCUS method is also investigated. While neither has a significant effect on predictive accuracy, the use of the FOCUS method leads to considerably more compact solutions.
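
The two-stage scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`auroc`, `rates_at`, `pick_threshold`), the `max_fpr` parameter, and the idea of capping the false positive rate as the user's "balance point" are all assumptions made for illustration. Stage one scores a candidate function by AUROC, which depends only on how it ranks detectable against undetectable peptides; stage two fixes a cut-off on those scores.

```python
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Stage 1: threshold-free quality of a scoring function.

    Computed as the fraction of (positive, negative) pairs that the
    scores rank correctly, counting ties as half (Mann-Whitney form).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rates_at(scores: Sequence[float], labels: Sequence[int],
             threshold: float) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) when every
    example with score >= threshold is classified as detectable."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   max_fpr: float) -> float:
    """Stage 2 (assumed policy): among thresholds whose false positive
    rate does not exceed max_fpr, return the one with the highest
    true positive rate. Returns infinity if no threshold qualifies,
    meaning nothing is classified as detectable."""
    best_t, best_tpr = float("inf"), 0.0
    for t in sorted(set(scores), reverse=True):
        tpr, fpr = rates_at(scores, labels, t)
        if fpr <= max_fpr and tpr > best_tpr:
            best_t, best_tpr = t, tpr
    return best_t


if __name__ == "__main__":
    # Made-up scores and labels (1 = detectable), for illustration only.
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    labels = [1, 1, 0, 1, 1, 0, 0, 0]
    print("AUROC:", auroc(scores, labels))
    print("threshold at max_fpr=0.25:", pick_threshold(scores, labels, 0.25))
```

Separating the stages this way means the search for a scoring function never has to commit to a trade-off between false positives and false negatives; that choice is deferred to the threshold step, where it can be changed without re-running the search.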

        Published In

        GECCO '07: Proceedings of the 9th annual conference on Genetic and evolutionary computation
        July 2007
        2313 pages
        ISBN:9781595936974
        DOI:10.1145/1276958

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. AUROC
        2. classification
        3. genetic programming
        4. input selection
        5. mass spectrometry
        6. proteomics

        Acceptance Rates

        GECCO '07 Paper Acceptance Rate 266 of 577 submissions, 46%;
        Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

        Cited By

        • (2023) Bioinformatics Tools and Knowledgebases to Assist Generating Targeted Assays for Plasma Proteomics. Serum/Plasma Proteomics, 557-577. DOI: 10.1007/978-1-0716-2978-9_32. Online publication date: 12-Feb-2023
        • (2019) AP3: An Advanced Proteotypic Peptide Predictor for Targeted Proteomics by Incorporating Peptide Digestibility. Analytical Chemistry, 91:13, 8705-8711. DOI: 10.1021/acs.analchem.9b02520. Online publication date: 6-Jun-2019
        • (2014) Genetic Programming for Measuring Peptide Detectability. Proceedings of the 10th International Conference on Simulated Evolution and Learning - Volume 8886, 593-604. DOI: 10.1007/978-3-319-13563-2_50. Online publication date: 15-Dec-2014
        • (2012) Computational approaches to protein inference in shotgun proteomics. BMC Bioinformatics, 13:S16. DOI: 10.1186/1471-2105-13-S16-S4. Online publication date: 5-Nov-2012
        • (2012) Genetic programming for biomarker detection in mass spectrometry data. Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence, 266-278. DOI: 10.1007/978-3-642-35101-3_23. Online publication date: 4-Dec-2012
        • (2011) CONSeQuence: Prediction of Reference Peptides for Absolute Quantitative Proteomics Using Consensus Machine Learning Approaches. Molecular & Cellular Proteomics, 10:11, M110.003384. DOI: 10.1074/mcp.M110.003384. Online publication date: 3-Aug-2011
        • (2010) The Importance of Peptide Detectability for Protein Identification, Quantification, and Experiment Design in MS/MS Proteomics. Journal of Proteome Research, 9:12, 6288-6297. DOI: 10.1021/pr1005586. Online publication date: 10-Nov-2010
        • (2008) Rapid prediction of optimum population size in genetic programming using a novel genotype -. Proceedings of the 10th annual conference on Genetic and evolutionary computation, 1315-1322. DOI: 10.1145/1389095.1389346. Online publication date: 13-Jul-2008
