A Multi-agent System for Protein Secondary Structure Prediction

Armano, Giuliano; Mancosu, Gianmaria; Orro, Alessandro; Vargiu, Eloisa

doi:10.1007/11599128_2

Part of the book series: Lecture Notes in Computer Science (TCSB, volume 3737)

Abstract

In this paper, we present a system for predicting protein secondary structure. Our proposal falls in the category of multiple experts, a machine learning technique that may outperform monolithic classifier systems, provided that the experts' errors are uncorrelated or negatively correlated. Prediction results from the interaction of a population of experts, each integrating genetic and neural technologies. Roughly speaking, an expert of this kind embodies a genetic classifier designed to control the activation of a feedforward artificial neural network. The genetic and neural components (the guard and the embedded predictor, respectively) perform different tasks and are supplied with different information. Each guard (soft-)partitions the input space, thereby ensuring both the diversity and the specialization of the corresponding embedded predictor, which in turn performs the actual prediction. Guards deal with inputs that encode information strictly related to relevant domain knowledge, whereas embedded predictors process other relevant inputs, each consisting of a limited window of residues. To investigate the performance of the proposed approach, a system has been implemented and tested on the RS126 set of proteins. Experimental results point to the potential of the approach.
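
The guard/embedded-predictor split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything named in it is an assumption made for illustration: the `Expert` class, the `encode_window` and `combine` helpers, the `hydrophobicity` domain feature, the hand-written guard functions (standing in for evolved genetic classifiers), and the single linear layer with random weights (standing in for a trained feedforward network). The combination rule, a guard-weighted average of the experts' class probabilities, is likewise only one plausible way to let the experts interact.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
CLASSES = ("H", "E", "C")             # helix, strand, coil


def encode_window(sequence, center, half_width):
    """One-hot encode the residues around `center`.

    Positions that fall outside the chain (or hold an unknown symbol)
    are encoded as all zeros.
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                one_hot[idx] = 1.0
        features.extend(one_hot)
    return features


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class Expert:
    """A guard paired with an embedded predictor (both hypothetical stand-ins)."""

    def __init__(self, guard, weights):
        self.guard = guard      # domain features -> activation in [0, 1]
        self.weights = weights  # one weight row per class (untrained here)

    def predict(self, window_features):
        scores = [sum(w * x for w, x in zip(row, window_features))
                  for row in self.weights]
        return softmax(scores)


def combine(experts, domain_features, window_features):
    """Average the experts' class probabilities, weighted by guard activation."""
    totals = [0.0] * len(CLASSES)
    mass = 0.0
    for expert in experts:
        activation = expert.guard(domain_features)
        if activation <= 0.0:
            continue  # the guard keeps this expert silent on this input
        probs = expert.predict(window_features)
        for k, p in enumerate(probs):
            totals[k] += activation * p
        mass += activation
    if mass == 0.0:
        # No expert is active: fall back to a uniform guess.
        return [1.0 / len(CLASSES)] * len(CLASSES)
    return [t / mass for t in totals]


# Usage with random (untrained) weights and two complementary soft guards.
rng = random.Random(0)
HALF_WIDTH = 2
N_INPUTS = (2 * HALF_WIDTH + 1) * len(AMINO_ACIDS)


def random_weights():
    return [[rng.uniform(-1.0, 1.0) for _ in range(N_INPUTS)] for _ in CLASSES]


experts = [
    Expert(lambda d: d["hydrophobicity"], random_weights()),
    Expert(lambda d: 1.0 - d["hydrophobicity"], random_weights()),
]
window = encode_window("MKTAYIAKQR", center=4, half_width=HALF_WIDTH)
probs = combine(experts, {"hydrophobicity": 0.7}, window)
predicted = CLASSES[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

Because the guards return graded activations rather than hard yes/no decisions, the partition of the input space is soft: several experts may contribute to the same residue, each in proportion to how strongly its guard matches.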

Author information

Authors and Affiliations

University of Cagliari, Piazza d’Armi, I-09123, Cagliari, Italy
Giuliano Armano, Alessandro Orro & Eloisa Vargiu
Shardna Life Sciences, Piazza Deffenu 4, I-09121, Cagliari, Italy
Gianmaria Mancosu

Editor information

Editors and Affiliations

Centre for Computational and Systems Biology, The Microsoft Research - University of Trento, Piazza Manci, 17, 38050, Povo (TN), Italy
Corrado Priami
Dipartimento di Matematica e Informatica, Università di Camerino, I-62032, Camerino, Italy
Emanuela Merelli
DEIS, Università di Bologna, Via Venezia 52, 47023, Cesena, Italy
Pablo Gonzalez
Alma Mater Studiorum Università di Bologna a Cesena, Via Venezia 52, 47023, Cesena, Italy
Andrea Omicini

About this paper

Cite this paper

Armano, G., Mancosu, G., Orro, A., Vargiu, E. (2005). A Multi-agent System for Protein Secondary Structure Prediction. In: Priami, C., Merelli, E., Gonzalez, P., Omicini, A. (eds) Transactions on Computational Systems Biology III. Lecture Notes in Computer Science, vol 3737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11599128_2

DOI: https://doi.org/10.1007/11599128_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30883-6
Online ISBN: 978-3-540-31446-2
eBook Packages: Computer Science, Computer Science (R0)
