DOI: 10.1145/2147805.2147899
poster

Improving protein-RNA interface prediction by combining sequence homology based method with a naive Bayes classifier: preliminary results

Published: 01 August 2011

Abstract

Protein-RNA interactions play important roles in cellular processes like protein synthesis, RNA processing, and gene expression regulation. Reliable identification of the interfaces involved in RNA-protein interactions is essential for comprehending the mechanisms and the functional implications of these interactions and provides a valuable guide for rational drug discovery and design. Because the determination of 3D structures of protein-RNA complexes has various technical limitations and is typically costly, reliable in silico interface prediction methods that require only the sequence information are urgently needed.
We present HomPRIP, a sequence homology based method for predicting protein-RNA interfaces, built on our conservation analysis of protein-RNA interfaces. We test HomPRIP on a benchmark dataset of 199 proteins and compare it with state-of-the-art protein-RNA interface prediction methods. Our results show that HomPRIP can reliably identify protein-RNA interface residues in 71% of test proteins with at least one putative sequence homolog passing the similarity thresholds of HomPRIP. Moreover, to facilitate predictions for proteins with no identified homologs, we develop HomPRIP-NB, a method that combines the HomPRIP predictor with a Naive Bayes (NB) classifier trained using evolutionary information derived from alignments against the NCBI nr database. Our results suggest that HomPRIP-NB significantly outperforms state-of-the-art machine learning methods for predicting protein-RNA interface residues.
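The way the two predictors could be combined can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the similarity thresholds, the alignment procedure, or the classifier features, so the `HomologHit` structure, the `min_identity` cutoff, and the `fallback` callable (standing in for the trained Naive Bayes classifier) are all hypothetical. The sketch only shows the control flow suggested above: transfer interface labels from a homolog when one passes the threshold, and otherwise defer to the classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class HomologHit:
    """Hypothetical pairwise alignment of a query to a homolog with known interface labels."""
    aligned_query: str         # query row of the alignment, '-' marks a gap
    aligned_homolog: str       # homolog row of the alignment, '-' marks a gap
    homolog_labels: List[int]  # 1 = interface residue, 0 = non-interface, one per homolog residue
    identity: float            # assumed similarity score in [0, 1]


def transfer_labels(hit: HomologHit) -> List[int]:
    """Map the homolog's interface labels onto query residues column by column."""
    labels: List[int] = []
    h_idx = 0  # index of the current residue in the ungapped homolog
    for q_char, h_char in zip(hit.aligned_query, hit.aligned_homolog):
        if q_char != "-":
            # A query residue aligned to a gap has no evidence: call it non-interface.
            labels.append(hit.homolog_labels[h_idx] if h_char != "-" else 0)
        if h_char != "-":
            h_idx += 1
    return labels


def predict_interface(
    query: str,
    hit: Optional[HomologHit],
    fallback: Callable[[str], List[int]],
    min_identity: float = 0.3,  # assumed cutoff, not a value from the paper
) -> List[int]:
    """Use homology transfer when a hit passes the cutoff, else the fallback classifier."""
    if hit is not None and hit.identity >= min_identity:
        return transfer_labels(hit)
    return fallback(query)


def toy_fallback(seq: str) -> List[int]:
    """Placeholder for a trained classifier: flags lysine and arginine only."""
    return [1 if residue in "KR" else 0 for residue in seq]
```

In this sketch the fallback is called only when no homolog is available or the best hit falls below the cutoff, which mirrors the stated purpose of HomPRIP-NB for proteins with no identified homologs.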


Published In

BCB '11: Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
August 2011
688 pages
ISBN:9781450307963
DOI:10.1145/2147805
  • General Chairs: Robert Grossman, Andrey Rzhetsky
  • Program Chairs: Sun Kim, Wei Wang

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 254 of 885 submissions, 29%
