Evolved Matrix Operations for Post-processing Protein Secondary Structure Predictions

  • Conference paper
Genetic Programming (EuroGP 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3003)

Abstract

Predicting the three-dimensional structure of proteins is a hard problem, so many have opted instead to predict the secondary structural state (usually helix, strand or coil) of each amino acid residue. This should be an easier task, but it now seems that a ceiling of around 76% per-residue three-state accuracy has been reached. Further improvements will require the correct processing of so-called “long-range information”. We present a novel application of genetic programming to evolve high-level matrix operations to post-process secondary structure prediction probabilities produced by the popular, state-of-the-art neural network-based PSIPRED by David Jones. We show that global and long-range information may be used to increase three-state accuracy by at least 0.26 percentage points – a small but statistically significant difference. This is on top of the 0.14 percentage point increase already made by PSIPRED’s built-in filters.
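The accuracy figures above are per-residue three-state percentages. As a rough illustration of what post-processing a matrix of prediction probabilities can look like, the following minimal sketch computes that accuracy from a per-residue probability matrix (one row per residue, columns for helix, strand and coil) and applies one simple global operation. Everything here is an assumption made for illustration: the function names, the toy probabilities, the column order `HEC`, and the blending step itself, which is a hand-written stand-in and not one of the evolved operations from the paper or part of PSIPRED.

```python
# Illustrative sketch only: names, data and the blending step are hypothetical,
# not the evolved matrix operations or the PSIPRED filters described in the paper.

STATES = "HEC"  # assumed column order: helix, strand, coil


def predict_states(probs):
    """Return the most probable state for each residue (one row per residue)."""
    return "".join(STATES[max(range(3), key=row.__getitem__)] for row in probs)


def q3_accuracy(predicted, observed):
    """Per-residue three-state accuracy, as a percentage."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def blend_with_global_mean(probs, weight):
    """Hypothetical global operation: mix every row with the column means,
    so each residue's probabilities are nudged by whole-protein composition."""
    n = len(probs)
    means = [sum(row[j] for row in probs) / n for j in range(3)]
    return [
        [(1 - weight) * row[j] + weight * means[j] for j in range(3)]
        for row in probs
    ]


# Toy data: five residues, observed states known.
probs = [
    [0.7, 0.1, 0.2],
    [0.6, 0.1, 0.3],
    [0.4, 0.1, 0.5],
    [0.6, 0.2, 0.2],
    [0.1, 0.2, 0.7],
]
observed = "HHHHC"

raw = predict_states(probs)
post = predict_states(blend_with_global_mean(probs, weight=0.6))
print(raw, q3_accuracy(raw, observed))
print(post, q3_accuracy(post, observed))
```

The toy data is deliberately contrived so that the global step changes one borderline residue; on real predictions a fixed blend like this could just as easily hurt accuracy, which is part of why searching over such operations automatically is attractive.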

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aggarwal, V., MacCallum, R.M. (2004). Evolved Matrix Operations for Post-processing Protein Secondary Structure Predictions. In: Keijzer, M., O’Reilly, UM., Lucas, S., Costa, E., Soule, T. (eds) Genetic Programming. EuroGP 2004. Lecture Notes in Computer Science, vol 3003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24650-3_20

  • DOI: https://doi.org/10.1007/978-3-540-24650-3_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21346-8

  • Online ISBN: 978-3-540-24650-3

  • eBook Packages: Springer Book Archive
