A deep neural network approach using distributed representations of RNA sequence and structure for identifying binding site of RNA-binding proteins | IEEE Conference Publication | IEEE Xplore

Abstract:

RNA-binding proteins (RBPs) play a crucial role in the post-transcriptional regulation of RNAs. Identifying RBP binding sites is a key step toward understanding the biological mechanism of post-transcriptional regulation. Although many computational methods have been developed for predicting RNA-protein binding sites, few studies consider k-mer embedding representations of RNA primary sequence and secondary structure specificities. In this paper, we develop a general deep learning framework, named deepRKE, to predict RNA-protein binding sites. deepRKE uses an unsupervised, shallow two-layer neural network to automatically learn distributed representations of k-mers by taking their neighboring context into account. Compared with the conventional k-mer approach, distributed representations effectively capture the latent relationships and similarities between k-mers. The distributed representations of the sequences and secondary structures are fed into a convolutional neural network (CNN) and a bidirectional long short-term memory network (BLSTM) to discriminate RBP binding sites from unbound sites. We comprehensively evaluate deepRKE on two large-scale RBP binding site datasets, and the experimental results show that deepRKE achieves better performance than five competitive methods.
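
As a rough illustration of what "learning k-mer representations from neighbor context" involves, the sketch below splits a toy RNA sequence into overlapping k-mers and pairs each k-mer with its neighbors, in the style of skip-gram training pairs. This is a minimal sketch under stated assumptions, not the deepRKE implementation: the abstract does not give the value of k, the context window size, or the exact training objective, so `kmerize`, `context_pairs`, `k=3`, and `window=1` are hypothetical choices made only for this example.

```python
from typing import List, Tuple


def kmerize(seq: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a sequence into overlapping k-mers with a sliding window.

    k and stride are illustrative defaults, not values from the paper.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def context_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Pair each k-mer with every neighbor at most `window` positions away.

    Pairs like these are the usual input to a shallow embedding model
    that predicts a token's context (an assumed objective here).
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


rna = "AUGGCUAC"  # toy sequence, not taken from the paper's datasets
tokens = kmerize(rna, k=3)
pairs = context_pairs(tokens, window=1)
print(tokens)
print(pairs[:3])
```

The same tokenization could plausibly be applied to a secondary-structure string so that sequence and structure each get their own k-mer vocabulary, though the abstract does not say how the structure input is encoded.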
Date of Conference: 18-21 November 2019
Date Added to IEEE Xplore: 06 February 2020
Conference Location: San Diego, CA, USA

