Abstract
Predicting the three-dimensional structure of proteins is a hard problem, so many have opted instead to predict the secondary structural state (usually helix, strand or coil) of each amino acid residue. This should be an easier task, but it now seems that a ceiling of around 76% per-residue three-state accuracy has been reached. Further improvements will require the correct processing of so-called “long-range information”. We present a novel application of genetic programming to evolve high-level matrix operations to post-process secondary structure prediction probabilities produced by the popular, state-of-the-art neural network-based PSIPRED by David Jones. We show that global and long-range information may be used to increase three-state accuracy by at least 0.26 percentage points – a small but statistically significant difference. This is on top of the 0.14 percentage point increase already made by PSIPRED’s built-in filters.
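As a rough illustration of the quantities discussed above, the following Python sketch computes per-residue three-state accuracy and applies a simple post-processing step to a matrix of per-residue state probabilities. It is only a sketch under stated assumptions: the sliding-window average is a hypothetical stand-in and is not the evolved matrix operation reported in the paper, the function names are invented for this example, and the probabilities are toy values rather than PSIPRED output.

```python
# Toy illustration only; names and data are assumptions, not from the paper.
STATES = "HEC"  # helix, strand, coil


def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def argmax_states(probs):
    """Turn an L x 3 matrix of per-residue probabilities into a state string."""
    return "".join(STATES[max(range(3), key=row.__getitem__)] for row in probs)


def window_average(probs, half_width=1):
    """Hypothetical post-processing: average each state column over a window."""
    n = len(probs)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        window = probs[lo:hi]
        smoothed.append([sum(r[k] for r in window) / len(window) for k in range(3)])
    return smoothed


# Toy data: six residues, columns ordered as helix, strand, coil.
observed = "HHHHCC"
probs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.1, 0.2],
    [0.3, 0.4, 0.3],  # isolated strand call inside a helix
    [0.8, 0.1, 0.1],
    [0.2, 0.1, 0.7],
    [0.1, 0.1, 0.8],
]

raw_prediction = argmax_states(probs)
smoothed_prediction = argmax_states(window_average(probs))
print(raw_prediction, q3_accuracy(raw_prediction, observed))
print(smoothed_prediction, q3_accuracy(smoothed_prediction, observed))
```

In this toy case the windowed average removes a single isolated strand call, which shows how information from neighbouring positions can change a per-residue decision. Real gains of the size reported in the abstract are fractions of a percentage point, measured over many proteins.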
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aggarwal, V., MacCallum, R.M. (2004). Evolved Matrix Operations for Post-processing Protein Secondary Structure Predictions. In: Keijzer, M., O’Reilly, UM., Lucas, S., Costa, E., Soule, T. (eds) Genetic Programming. EuroGP 2004. Lecture Notes in Computer Science, vol 3003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24650-3_20
DOI: https://doi.org/10.1007/978-3-540-24650-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21346-8
Online ISBN: 978-3-540-24650-3