Abstract
Circular RNA (circRNA) is an RNA molecule that, unlike linear RNA, has a covalently closed loop structure. CircRNAs can act as miRNA sponges and can interact with RNA-binding proteins. Previous studies have revealed that circRNAs play important roles in the development of different diseases. The biological functions of circRNAs can be investigated through circRNA-protein interactions. Because circRNA data are scarce, circRNA sequences are long, and binding sites are sparsely distributed on circRNAs, far fewer efforts have been made to study circRNA-protein interaction than interaction between linear RNA and protein. With the increase in experimental data on circRNA, machine learning methods have recently become widely used for predicting circRNA-protein interaction. Existing methods use either the RNA sequence or the protein sequence to predict the binding sites. In this paper, we present a new method, PCPI (Predicting CircRNA and Protein Interaction), which predicts the interaction between circRNA and protein using a support vector machine (SVM) classifier. We used both the RNA and protein sequences to predict their interaction. The circRNA sequences were converted into pseudo-peptide sequences based on codon translation. The residues of the pseudo-peptide and protein sequences were then classified based on the dipole moments and the volumes of their side chains. The 3-mers of the classified sequences were used as features for training the model. Several machine learning models were evaluated for classification. After comparing their performance, we selected the SVM classifier for predicting circRNA-protein interaction. Our method achieved 93% prediction accuracy.
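The feature construction described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the residue classes, so the sketch assumes the seven-class grouping commonly used in conjoint-triad encodings. It also assumes the standard genetic code, reading frame 0, and that stop codons and incomplete trailing codons are dropped. The function names (translate, encode, kmer_features, pair_features) are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the canonical TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[i]
    for i, (a, b, c) in enumerate(product(BASES, repeat=3))
}

# ASSUMPTION: seven residue classes (conjoint-triad style grouping by
# side-chain dipole and volume); the abstract does not list the classes.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def translate(rna):
    """Translate an RNA sequence into a pseudo-peptide (frame 0).

    ASSUMPTION: stop codons, unknown codons, and an incomplete trailing
    codon are skipped rather than terminating translation.
    """
    dna = rna.upper().replace("U", "T")
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "")
        if aa != "*":
            residues.append(aa)
    return "".join(residues)


def encode(peptide):
    """Map each residue to its class label; unknown residues are dropped."""
    return "".join(CLASS_OF[aa] for aa in peptide.upper() if aa in CLASS_OF)


def kmer_features(classes, k=3):
    """Return normalized frequencies of all 7**k class k-mers."""
    keys = ["".join(p) for p in product("0123456", repeat=k)]
    counts = dict.fromkeys(keys, 0)
    n_windows = len(classes) - k + 1
    for i in range(max(0, n_windows)):
        counts[classes[i:i + k]] += 1
    total = max(1, n_windows)
    return [counts[key] / total for key in keys]


def pair_features(circ_rna, protein):
    """Concatenate the 3-mer vectors of a circRNA and a protein."""
    rna_vec = kmer_features(encode(translate(circ_rna)))
    protein_vec = kmer_features(encode(protein))
    return rna_vec + protein_vec
```

Each circRNA-protein pair then yields a 686-dimensional vector (343 values per sequence), which could be passed to any standard classifier, for example an SVM. How the original work normalizes or combines the two vectors is not stated in the abstract, so the concatenation here is only one plausible choice.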
Funding
This work was partly supported by the Key Research and Development Project of Guangdong Province under grant No. 2021B0101310002, the National Key Research and Development Program of China under grant No. 2021YFF1200100, the Strategic Priority CAS Project XDB38050100, the National Science Foundation of China under grant No. 62272449, and the Shenzhen Basic Research Fund under grant Nos. RCYX20200714114734194, KQTD20200820113106007 and ZDSYS20220422103800001. We also acknowledge funding support from the Youth Innovation Promotion Association, CAS (Y2021101), to Yanjie Wei.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Hossain, M., Reza, M., Li, X., Peng, Y., Feng, S., Wei, Y. (2023). PCPI: Prediction of circRNA and Protein Interaction Using Machine Learning Method. In: Guo, X., Mangul, S., Patterson, M., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2023. Lecture Notes in Computer Science, vol 14248. Springer, Singapore. https://doi.org/10.1007/978-981-99-7074-2_8
DOI: https://doi.org/10.1007/978-981-99-7074-2_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7073-5
Online ISBN: 978-981-99-7074-2
eBook Packages: Computer Science (R0)