
ILP Characterization of 3D Protein-Binding Sites and FCA-Based Interpretation

  • Conference paper
Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2012)

Abstract

Life sciences continuously produce large amounts of complex data that require relational learning to facilitate knowledge discovery. Inductive Logic Programming (ILP) is a powerful method that allows an expressive representation of the data and produces explicit knowledge. However, ILP systems return theories that vary with the user's heuristic choices of parameters, and they may miss potentially relevant rules. Accordingly, we propose an original approach based on post-ILP propositionalization of the examples and Formal Concept Analysis (FCA) for effective interpretation of the induced rules, with the possibility of adding domain knowledge. Our approach is applied to the characterization of three-dimensional (3D) protein-binding sites, which are the protein portions on which interactions with other proteins take place. We define a relational representation of protein 3D patches and formalize the problem as a concept learning problem using ILP. We report here the results obtained on particular protein-binding sites, namely phosphorylation sites, using ILP followed by FCA-based interpretation.
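
The two post-ILP steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the example names, descriptors, and rules below are invented for illustration, the rules are plain Python predicates standing in for clauses an ILP system might induce, and the concepts are enumerated by brute force rather than with a dedicated FCA tool. Propositionalization records, for each example, which rules cover it; FCA then groups examples and rules into formal concepts, that is, maximal sets of examples sharing a maximal set of rules.

```python
from itertools import combinations

# Hypothetical toy examples: each binding site is reduced to a set of
# invented descriptors (not taken from the paper).
examples = {
    "site1": {"polar", "charged"},
    "site2": {"polar"},
    "site3": {"charged", "aromatic"},
}

# Hypothetical rules: Python predicates standing in for induced clauses.
rules = {
    "r1": lambda d: "polar" in d,
    "r2": lambda d: "charged" in d,
    "r3": lambda d: "aromatic" in d,
}

# Propositionalization: one binary attribute per rule, true when the
# rule covers the example. The result is a formal context.
context = {
    name: {r for r, covers in rules.items() if covers(desc)}
    for name, desc in examples.items()
}

def intent(objects, ctx, all_attrs):
    """Attributes (rules) shared by every object in `objects`."""
    shared = set(all_attrs)
    for obj in objects:
        shared &= ctx[obj]
    return frozenset(shared)

def extent(attrs, ctx):
    """Objects (examples) that have every attribute in `attrs`."""
    return frozenset(o for o, a in ctx.items() if attrs <= a)

def formal_concepts(ctx):
    """Enumerate all formal concepts by closing every object subset.

    Exponential in the number of objects; only suitable for toy contexts.
    """
    all_attrs = set().union(*ctx.values())
    objs = sorted(ctx)
    found = set()
    for k in range(len(objs) + 1):
        for subset in combinations(objs, k):
            b = intent(subset, ctx, all_attrs)
            found.add((extent(b, ctx), b))
    return found

concepts = formal_concepts(context)
for ext, itt in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

In this toy context, the concept pairing `site1` and `site3` with rule `r2` shows how the lattice exposes groups of examples that several rules describe jointly. Domain knowledge could be added as extra attribute columns in `context` before the concepts are computed.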



© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bresso, E., Grisoni, R., Devignes, MD., Napoli, A., Smail-Tabbone, M. (2013). ILP Characterization of 3D Protein-Binding Sites and FCA-Based Interpretation. In: Fred, A., Dietz, J.L.G., Liu, K., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2012. Communications in Computer and Information Science, vol 415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54105-6_6


  • DOI: https://doi.org/10.1007/978-3-642-54105-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54104-9

  • Online ISBN: 978-3-642-54105-6

  • eBook Packages: Computer Science (R0)
