Abstract
Protein domains are the building blocks of proteins. Their interactions are crucial in forming stable protein-protein interactions (PPI) and take part in many cellular processes and biochemical events. Prediction of protein domain-domain interactions (DDI) is an emerging problem in computational biology. Unlike earlier work on DDI prediction, which exploits only a single protein database, this paper introduces an integrative approach to DDI prediction that exploits multiple genome databases using inductive logic programming (ILP). The main contributions of this work to biomedical knowledge discovery are a newly generated database of more than 100,000 ground facts over twenty predicates on protein domains, and various DDI findings that are evaluated to be significant. Experimental results show that ILP is more appropriate for this learning problem than several other methods. In addition, many predictive rules associated with domain sites, conserved motifs, protein functions, and biological pathways were found.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nguyen, T.P., Ho, T.B. (2006). Prediction of Domain-Domain Interactions Using Inductive Logic Programming from Multiple Genome Databases. In: Todorovski, L., Lavrač, N., Jantke, K.P. (eds) Discovery Science. DS 2006. Lecture Notes in Computer Science, vol 4265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893318_20
DOI: https://doi.org/10.1007/11893318_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46491-4
Online ISBN: 978-3-540-46493-8
eBook Packages: Computer Science, Computer Science (R0)