Detection of Remote Protein Homologs Using Social Programming

Foundations of Computational Intelligence Volume 4

Part of the book series: Studies in Computational Intelligence ((SCI,volume 204))

Summary

We present a Grammatical Swarm (GS) for the optimization of an aggregation operator. The operator combines the outputs of several classifiers into a single score, from which an optimized ranking of the individuals is produced. We apply our method to the identification of new members of a protein family. Support Vector Machine and Naive Bayes classifiers exploit complementary features to compute probability estimates. A major advantage of the GS is that it produces a human-readable algorithm that reveals the contribution of each classifier. Because the volume of candidate sequences is large, ranking quality is of crucial importance. Consequently, our fitness criterion is based on the Area Under the ROC Curve (AUC) rather than on the classification error rate. We discuss the performance obtained for a particular family, the cytokines, and show that this technique is an efficient means of ranking protein sequences.
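The AUC-based fitness described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names (`auc`, `fitness`), the hand-written aggregation operators, and the toy probability values are all assumptions made for illustration. In the chapter the operator is evolved by the Grammatical Swarm; here it is simply supplied as a function. The AUC is computed through its equivalence with the Wilcoxon-Mann-Whitney statistic, that is, the fraction of (positive, negative) pairs ranked in the right order.

```python
from typing import Callable, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the Wilcoxon-Mann-Whitney statistic."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count correctly ordered (positive, negative) pairs; a tie counts one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fitness(
    aggregate: Callable[[float, float], float],
    svm_probs: Sequence[float],
    nb_probs: Sequence[float],
    labels: Sequence[int],
) -> float:
    """Score a candidate aggregation operator by the AUC of its combined ranking."""
    combined = [aggregate(s, b) for s, b in zip(svm_probs, nb_probs)]
    return auc(combined, labels)


# Hypothetical probability estimates for five sequences (1 = family member).
svm_probs = [0.9, 0.4, 0.6, 0.3, 0.2]
nb_probs = [0.5, 0.9, 0.3, 0.6, 0.1]
labels = [1, 1, 0, 0, 0]


def svm_only(s: float, b: float) -> float:
    return s


def nb_only(s: float, b: float) -> float:
    return b


def mean_op(s: float, b: float) -> float:
    return (s + b) / 2


for name, op in [("svm_only", svm_only), ("nb_only", nb_only), ("mean", mean_op)]:
    print(f"{name}: AUC = {fitness(op, svm_probs, nb_probs, labels):.3f}")
```

On this toy data each classifier alone misorders one of the six positive/negative pairs, while the mean of the two scores orders all of them correctly. This is the kind of complementarity an evolved aggregation operator is meant to exploit. A swarm-based search would call `fitness` once per candidate operator and keep the best-scoring ones.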

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Ramstein, G., Beaume, N., Jacques, Y. (2009). Detection of Remote Protein Homologs Using Social Programming. In: Abraham, A., Hassanien, AE., de Carvalho, A.P.d.L.F. (eds) Foundations of Computational Intelligence Volume 4. Studies in Computational Intelligence, vol 204. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01088-0_12

  • DOI: https://doi.org/10.1007/978-3-642-01088-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01087-3

  • Online ISBN: 978-3-642-01088-0

  • eBook Packages: Engineering, Engineering (R0)
