A Meta-Graph for the Construction of an RNA-Centered Knowledge Graph

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2023)

Abstract

The COVID-19 pandemic highlighted the importance of RNA-based technologies for the development of new vaccines. Besides vaccines, a world of RNA-based drugs, including small non-coding RNA, could open new avenues for the development of novel therapies covering the full spectrum of the main human diseases. In the context of the “National Center for Gene Therapy and Drugs based on RNA Technology” funded by the Italian PNRR and the NextGenerationEU program, our lab will contribute to the construction of a Knowledge Graph (KG) for RNA-drug analysis and the development of innovative algorithms to support RNA-drug discovery. In this paper, we describe the initial steps for the identification of public data sources from which information about different kinds of non-coding RNA sequences (and their relationships with other molecules) can be collected and used for feeding the KG. An in-depth analysis of the characteristics of these sources is provided, along with a meta-graph we developed to guide the RNA-KG construction by exploiting and integrating biomedical ontologies and relevant data from public databases.
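As a rough illustration of how a meta-graph can guide KG construction, the sketch below treats the meta-graph as a set of permitted (source type, relation, target type) combinations and keeps only the candidate triples that conform to it. The node types, relation names, identifiers, and helper functions are hypothetical placeholders chosen for the example. They are not taken from the RNA-KG meta-graph.

```python
from typing import Iterable, List, NamedTuple, Set


class EdgeType(NamedTuple):
    """One permitted edge in the meta-graph (schema level)."""
    source_type: str
    relation: str
    target_type: str


class Triple(NamedTuple):
    """One candidate fact extracted from a data source (instance level)."""
    subj: str
    subj_type: str
    relation: str
    obj: str
    obj_type: str


# Hypothetical meta-graph: these types and relations are illustrative only.
META_GRAPH: Set[EdgeType] = {
    EdgeType("miRNA", "targets", "gene"),
    EdgeType("miRNA", "associated_with", "disease"),
    EdgeType("lncRNA", "associated_with", "disease"),
}


def conforms(triple: Triple, meta_graph: Set[EdgeType] = META_GRAPH) -> bool:
    """Return True if the triple's type signature is allowed by the meta-graph."""
    signature = EdgeType(triple.subj_type, triple.relation, triple.obj_type)
    return signature in meta_graph


def filter_triples(triples: Iterable[Triple]) -> List[Triple]:
    """Keep only the triples whose type signature appears in the meta-graph."""
    return [t for t in triples if conforms(t)]


if __name__ == "__main__":
    # Made-up candidate triples; identifiers are placeholders.
    candidates = [
        Triple("mirna-A", "miRNA", "targets", "gene-X", "gene"),
        Triple("gene-X", "gene", "targets", "mirna-A", "miRNA"),  # not permitted
    ]
    for t in filter_triples(candidates):
        print(t.subj, t.relation, t.obj)
```

Storing the meta-graph as a set of type signatures keeps the conformance check a single membership test. A real pipeline would more likely draw node and relation types from biomedical ontologies, as the abstract describes.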

Notes

  1. Initial results available at https://github.com/AnacletoLAB/RNA-KG.

Acknowledgments

This research was supported by the “National Center for Gene Therapy and Drugs based on RNA Technology”, PNRR-NextGenerationEU program [G43C22001320007]. The authors wish to thank Tiffany J. Callahan, Justin T. Reese, and Peter N. Robinson for useful discussions on the topics of this paper.

Author information

Corresponding author

Correspondence to Giorgio Valentini.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cavalleri, E. et al. (2023). A Meta-Graph for the Construction of an RNA-Centered Knowledge Graph. In: Rojas, I., Valenzuela, O., Rojas Ruiz, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2023. Lecture Notes in Computer Science, vol 13919. Springer, Cham. https://doi.org/10.1007/978-3-031-34953-9_13

  • DOI: https://doi.org/10.1007/978-3-031-34953-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34952-2

  • Online ISBN: 978-3-031-34953-9

  • eBook Packages: Computer Science, Computer Science (R0)
