Advanced Tree-Based Kernels for Protein Classification

  • Conference paper
AI*IA 2007: Artificial Intelligence and Human-Oriented Computing (AI*IA 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4733)

Abstract

One of the aims of modern bioinformatics is to discover the molecular mechanisms that govern protein function. This would allow us to understand the complex processes at work in living systems and possibly to correct dysfunctions. The first step in this direction is the identification of the functional sites of proteins.

In this paper, we propose new kernels for the automatic classification of protein active sites. In particular, we devise attribute-value and tree-substructure representations that model the biological and spatial information of proteins in Support Vector Machines. We experimented with these models on the Protein Data Bank, pre-processed to make the active site information explicit. Our results show that structural kernels, used in combination with polynomial kernels, can effectively discriminate an active site from other regions of a protein. This finding is important because it shows, for the first time, a successful identification of catalytic sites for a very large family of proteins belonging to a broad class of enzymes.
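
As a rough illustration of what combining a structural kernel with a polynomial kernel can look like, the sketch below sums a tree kernel and a polynomial kernel over an example made of a tree and a feature vector. It is not the method of the paper, whose tree representations and kernel definitions are not given in the abstract. The tree kernel is the standard subset-tree kernel of Collins and Duffy, used as a stand-in; the nested-tuple tree encoding, the function names, the toy labels, and all parameter values are assumptions made for this example. The sum of two valid kernels is itself a valid kernel, so a combined function of this kind could be supplied to an SVM.

```python
from itertools import product


def internal_nodes(tree):
    """Yield every node that has at least one child.

    A tree is a nested tuple (label, child, child, ...); a leaf is (label,).
    This encoding is an assumption of the sketch.
    """
    if len(tree) > 1:
        yield tree
        for child in tree[1:]:
            yield from internal_nodes(child)


def production(node):
    """Return the node label together with the labels of its children."""
    return (node[0], tuple(child[0] for child in node[1:]))


def delta(n1, n2, lam):
    """Decay-weighted count of common fragments rooted at n1 and n2."""
    if len(n1) == 1 or len(n2) == 1:
        return 0.0  # leaves root no fragment
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        score *= 1.0 + delta(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=1.0):
    """Subset-tree kernel: sum delta over all pairs of internal nodes."""
    pairs = product(internal_nodes(t1), internal_nodes(t2))
    return sum(delta(a, b, lam) for a, b in pairs)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree on plain sequences."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def combined_kernel(ex1, ex2, lam=1.0, degree=2):
    """Sum of the tree kernel and the polynomial kernel.

    Each example is a (tree, feature_vector) pair (hypothetical layout).
    """
    (t1, v1), (t2, v2) = ex1, ex2
    return tree_kernel(t1, t2, lam) + poly_kernel(v1, v2, degree)


# Toy trees with placeholder labels; they carry no biological meaning.
TREE_A = ("S", ("A", ("a",)), ("B", ("b",)))
TREE_B = ("S", ("A", ("a",)), ("C", ("b",)))
```

With `lam=1.0`, `tree_kernel(TREE_A, TREE_A)` counts the fragments the tree shares with itself, while `tree_kernel(TREE_A, TREE_B)` counts only the fragment rooted at the shared `A` node. A decay value below 1 down-weights larger fragments.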

Editor information

Roberto Basili, Maria Teresa Pazienza

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cilia, E., Moschitti, A. (2007). Advanced Tree-Based Kernels for Protein Classification. In: Basili, R., Pazienza, M.T. (eds) AI*IA 2007: Artificial Intelligence and Human-Oriented Computing. AI*IA 2007. Lecture Notes in Computer Science, vol 4733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74782-6_20

  • DOI: https://doi.org/10.1007/978-3-540-74782-6_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74781-9

  • Online ISBN: 978-3-540-74782-6

  • eBook Packages: Computer Science, Computer Science (R0)
