Abstract
Interactions between RNA and proteins (RPIs) play a crucial role in most cellular processes, such as RNA stability and translation. Although many high-throughput experiments have recently been developed to detect RPIs, they remain time-consuming and labor-intensive. An efficient computational method for predicting RPIs is therefore urgently needed. In this study, we propose a novel approach for predicting protein-ncRNA interactions from sequence information only. Representative features of the proteins and ncRNAs were extracted using the bi-gram probability feature extraction method and the k-mer algorithm. To evaluate the proposed model, a random forest classifier was trained on two widely used datasets, RPI1807 and RPI2241, with five-fold cross-validation. The model achieved AUCs of 0.992 and 0.947 on RPI1807 and RPI2241, respectively, indicating that the approach is effective for predicting RPIs and can serve as a reference for future biological research.
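The two feature extractors named in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption made for illustration: that k-mer frequencies are computed over the ncRNA alphabet ACGU, that the bi-gram feature is the sum over consecutive positions of products of per-position probabilities (for example, rows of a normalized profile matrix), and the function names `kmer_frequencies` and `bigram_features`, which are hypothetical.

```python
from itertools import product


def kmer_frequencies(seq, k=4, alphabet="ACGU"):
    """Return normalized k-mer frequencies in a fixed lexicographic order.

    Assumption: DNA-style 'T' is mapped to 'U'; windows containing symbols
    outside the alphabet (e.g. 'N') are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper().replace("T", "U")
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def bigram_features(profile):
    """Bi-gram features from an L x A matrix of per-position probabilities.

    B[m][n] = sum_i profile[i][m] * profile[i + 1][n], flattened row-major
    into a vector of length A * A (400 for a 20-column protein profile).
    """
    if not profile:
        return []
    a = len(profile[0])
    feats = [0.0] * (a * a)
    for row, nxt in zip(profile, profile[1:]):
        for m in range(a):
            for n in range(a):
                feats[m * a + n] += row[m] * nxt[n]
    return feats


# Toy inputs, not taken from RPI1807 or RPI2241.
rna_vec = kmer_frequencies("ACGUACGU", k=2)          # 4**2 = 16 features
protein_vec = bigram_features([[1, 0], [0, 1], [1, 0]])  # 2 * 2 = 4 features
pair_vec = protein_vec + rna_vec                      # one sample for a classifier
```

Concatenating the two vectors gives one fixed-length sample per protein-ncRNA pair, which is the kind of input a random forest classifier expects. Whether the chapter concatenates the features in this way is not stated in the abstract.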
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhan, ZH., You, ZH., Zhou, Y., Li, LP., Li, ZW. (2018). Efficient Framework for Predicting ncRNA-Protein Interactions Based on Sequence Information by Deep Learning. In: Huang, DS., Jo, KH., Zhang, XL. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10955. Springer, Cham. https://doi.org/10.1007/978-3-319-95933-7_41
DOI: https://doi.org/10.1007/978-3-319-95933-7_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95932-0
Online ISBN: 978-3-319-95933-7
eBook Packages: Computer Science (R0)