Abstract:
RNA-binding proteins (RBPs) play a crucial role in the post-transcriptional regulation of RNAs. Identifying RBP binding sites is a key step toward understanding the biological mechanism of post-transcriptional regulation. Although many computational methods have been developed for predicting RNA-protein binding sites, few studies consider the k-mer embedding representation of RNA primary sequence and secondary structure specificities. In this paper, we develop a general deep learning framework, named deepRKE, to predict RNA-protein binding sites. deepRKE uses an unsupervised shallow two-layer neural network to automatically learn the distributed representation of k-mers by taking their neighboring context into account. Compared with the conventional k-mer approach, distributed representations effectively capture the latent relationships and similarities between k-mers. The distributed representations of the sequences and secondary structures are fed into a convolutional neural network (CNN) and a bidirectional long short-term memory network (BLSTM) to discriminate RBP binding sites from unbound sites. We comprehensively evaluate deepRKE on two large-scale RBP binding site datasets, and the experimental results show that deepRKE achieves better performance than five competitive methods.
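As a rough illustration of the k-mer step described above, the following sketch splits an RNA sequence into overlapping k-mers and builds (center, neighbor) pairs of the kind a context-based embedding model could be trained on. The function names, the choice of k = 3, the window size, and the skip-gram-style pairing are all assumptions made for illustration; the abstract does not specify them, and this is not the deepRKE implementation.

```python
def kmer_tokens(sequence, k=3, stride=1):
    """Split an RNA sequence into overlapping k-mers with a sliding window.

    k and stride are illustrative defaults, not values taken from the paper.
    """
    # Normalize case and map DNA-style T to U so tokens share one alphabet.
    sequence = sequence.upper().replace("T", "U")
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def context_pairs(tokens, window=2):
    """Build (center, neighbor) pairs from a token list.

    Each k-mer is paired with the k-mers up to `window` positions away,
    which is one common way to expose neighbor context to an embedding model
    (assumed here; the abstract only says that context is taken into account).
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    tokens = kmer_tokens("AUGGCUA", k=3)
    print(tokens)
    print(context_pairs(tokens, window=1))
```

In a pipeline of this kind, the resulting pairs would be passed to an embedding trainer, and the learned k-mer vectors would then replace one-hot inputs to the downstream network.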
Date of Conference: 18-21 November 2019
Date Added to IEEE Xplore: 06 February 2020