Biomedical Knowledge Graph Embeddings for Personalized Medicine

  • Conference paper
Progress in Artificial Intelligence (EPIA 2021)

Abstract

Personalized medicine promises to revolutionize healthcare in the coming years. However, significant challenges remain, particularly in integrating the vast amount of biomedical knowledge generated in recent years. Here we describe an approach that applies Knowledge Graph Embedding (KGE) methods to a biomedical Knowledge Graph (KG) as a path to reasoning over the wealth of information stored in publicly accessible databases. We use curated databases such as Ensembl, DisGeNET and Gene Ontology as data sources to build a KG containing relationships between genes, diseases and other biological entities, and we explore the potential of KGE methods to derive medically relevant insights from this KG. To showcase the method’s usefulness, we describe two use cases: (a) prediction of gene-disease associations and (b) clustering of disease embeddings. We show that the top gene-disease associations predicted by this approach can be confirmed in external databases or have already been identified in the literature. An analysis of clusters of diseases, with a focus on Autism Spectrum Disorder (ASD), affords novel insights into the biology of this paradigmatic complex disorder and the overlap of its genetic background with other diseases.
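
As a rough illustration of the gene-disease association use case, the sketch below ranks candidate genes for a disease by a translation-based (TransE-style) plausibility score. The abstract does not state which KGE model or toolkit was used, so the scoring function is an assumption made for illustration only, and the entity names, the relation name `associated_with` and the two-dimensional vectors are all hypothetical toy values rather than data from the paper.

```python
import math

# Hypothetical toy embeddings (2-D for readability); not taken from the paper.
entity_emb = {
    "GENE_A": [0.9, 0.1],
    "GENE_B": [0.1, 0.8],
    "GENE_C": [0.5, 0.5],
    "DISEASE_X": [1.0, 1.0],
}
# Hypothetical relation name and vector.
relation_emb = {"associated_with": [0.1, 0.9]}


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance between (head + relation) and tail.

    Scores closer to zero mean the triple (head, relation, tail) is more plausible.
    """
    h = entity_emb[head]
    r = relation_emb[relation]
    t = entity_emb[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_genes(disease, genes, relation="associated_with"):
    """Sort candidate genes from most to least plausible for the given disease."""
    return sorted(
        genes,
        key=lambda gene: transe_score(gene, relation, disease),
        reverse=True,
    )


ranking = rank_genes("DISEASE_X", ["GENE_A", "GENE_B", "GENE_C"])
print(ranking)
```

In a real setting the embeddings would be learned from the KG triples instead of written by hand, and the top-ranked pairs not already present in the graph would be the candidate associations to check against external databases or the literature.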

Acknowledgements

The authors would like to acknowledge the support by the UID/MULTI/04046/2019 centre grant from FCT, Portugal (to BioISI), and the MedPerSyst project (POCI-01-0145-FEDER-016428-PAC) “Redes sinapticas e abordagens compreensivas de medicina personalizada em doenças neurocomportamentais ao longo da vida” (SAICTPAC/0010/2015). This work used the European Grid Infrastructure (EGI) with the support of NCG-INGRID-PT/INCD (Portugal). This work was produced with the support of INCD funded by FCT and FEDER under the project 01/SAICT/2016 n° 022153.

Author information

Corresponding author

Correspondence to Hugo Martiniano.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Vilela, J. et al. (2021). Biomedical Knowledge Graph Embeddings for Personalized Medicine. In: Marreiros, G., Melo, F.S., Lau, N., Lopes Cardoso, H., Reis, L.P. (eds) Progress in Artificial Intelligence. EPIA 2021. Lecture Notes in Computer Science, vol 12981. Springer, Cham. https://doi.org/10.1007/978-3-030-86230-5_46

  • DOI: https://doi.org/10.1007/978-3-030-86230-5_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86229-9

  • Online ISBN: 978-3-030-86230-5

  • eBook Packages: Computer Science, Computer Science (R0)
