A Multi-agent System for Protein Secondary Structure Prediction

  • Conference paper
Transactions on Computational Systems Biology III

Part of the book series: Lecture Notes in Computer Science (TCSB, volume 3737)

Abstract

In this paper, we illustrate a system aimed at predicting protein secondary structures. Our proposal falls in the category of multiple experts, a machine learning technique that may outperform monolithic classifier systems, provided that the experts’ errors are uncorrelated or negatively correlated. The prediction activity results from the interaction of a population of experts, each integrating genetic and neural technologies. Roughly speaking, an expert of this kind embodies a genetic classifier designed to control the activation of a feedforward artificial neural network. The genetic and neural components (i.e., the guard and the embedded predictor, respectively) perform different tasks and are supplied with different information. Each guard (soft-)partitions the input space, thereby ensuring both the diversity and the specialization of the corresponding embedded predictor, which in turn performs the actual prediction. Guards deal with inputs that encode information strictly related to relevant domain knowledge, whereas embedded predictors process other relevant inputs, each consisting of a limited window of residues. To investigate the performance of the proposed approach, a system has been implemented and tested on the RS126 set of proteins. Experimental results point to the potential of the approach.
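The guard/embedded-predictor arrangement described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the guard is reduced to a plain function returning an activation in [0, 1] (the paper uses a genetic classifier), the embedded predictor is a stub returning fixed scores (the paper uses a feedforward neural network), and the names `GuardedExpert`, `predict`, `EXPERTS`, the `hydrophobicity` feature, the activation-weighted sum, and the coil fallback are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Three-class labelling commonly used for secondary structure:
# H = helix, E = strand, C = coil.
CLASSES = ("H", "E", "C")

# Hypothetical type aliases: a guard maps domain-knowledge features to an
# activation in [0, 1]; a predictor maps a residue window to class scores.
Guard = Callable[[Dict[str, float]], float]
Predictor = Callable[[str], List[float]]


class GuardedExpert:
    """One expert: a guard that controls an embedded predictor."""

    def __init__(self, guard: Guard, predictor: Predictor) -> None:
        self.guard = guard
        self.predictor = predictor


def predict(experts: Sequence[GuardedExpert],
            features: Dict[str, float],
            window: str) -> str:
    """Combine the experts whose guards are active for this input.

    Assumption: outputs are combined by an activation-weighted sum;
    the paper's actual combination rule is not given in the abstract.
    """
    totals = [0.0] * len(CLASSES)
    weight_sum = 0.0
    for expert in experts:
        weight = expert.guard(features)
        if weight <= 0.0:
            continue  # guard inactive: the embedded predictor is skipped
        scores = expert.predictor(window)
        for i, score in enumerate(scores):
            totals[i] += weight * score
        weight_sum += weight
    if weight_sum == 0.0:
        return "C"  # hypothetical fallback when no guard matches
    best = max(range(len(CLASSES)), key=lambda i: totals[i])
    return CLASSES[best]


# Toy population: stub predictors stand in for trained neural networks,
# and "hydrophobicity" is an invented feature used only for illustration.
EXPERTS = [
    GuardedExpert(lambda f: f["hydrophobicity"],
                  lambda w: [0.8, 0.1, 0.1]),
    GuardedExpert(lambda f: 1.0 - f["hydrophobicity"],
                  lambda w: [0.1, 0.2, 0.7]),
]

if __name__ == "__main__":
    print(predict(EXPERTS, {"hydrophobicity": 0.9}, "AVLIMAL"))
    print(predict(EXPERTS, {"hydrophobicity": 0.1}, "GSPGNGS"))
```

Because the two guards respond to opposite regions of the feature space, each stub predictor dominates on a different subset of inputs, which is the soft-partitioning idea in miniature.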

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Armano, G., Mancosu, G., Orro, A., Vargiu, E. (2005). A Multi-agent System for Protein Secondary Structure Prediction. In: Priami, C., Merelli, E., Gonzalez, P., Omicini, A. (eds) Transactions on Computational Systems Biology III. Lecture Notes in Computer Science, vol 3737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11599128_2

  • DOI: https://doi.org/10.1007/11599128_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30883-6

  • Online ISBN: 978-3-540-31446-2

  • eBook Packages: Computer Science, Computer Science (R0)
