DOI: 10.1145/3514221.3517903
research-article
Public Access
Artifacts Evaluated & Reusable / v1.1

Compact Walks: Taming Knowledge-Graph Embeddings with Domain- and Task-Specific Pathways

Published: 11 June 2022

ABSTRACT

Knowledge-graph (KG) embeddings have emerged as a promising approach to addressing challenges faced by modern biomedical research, including the growing gap between therapeutic needs and available treatments. The popularity of KG embeddings in graph analytics is on the rise, due at least partially to the presumed semanticity of the learned embeddings. Unfortunately, the ability of a node neighborhood picked up by an embedding to capture the node's semantics may depend on the characteristics of the data. One of the reasons for this problem is that KG nodes can be promiscuous, that is, associated with a number of different relationships that are not unique to, or indicative of, the properties of the nodes.

To address the promiscuity challenge and the documented runtime-performance challenge in real-life KG embedding tools, we propose to use domain- and task-specific information to specify regular-expression pathways that define neighborhoods of KG nodes of interest. Our proposed CompactWalks framework uses these semantic subgraphs to enable meaningful compact walks in random-walk based KG embedding methods. We report the results of case studies for the task of determining which pharmaceutical drugs could treat the same diseases. The findings suggest that our CompactWalks approach has the potential to address the promiscuity and runtime-performance challenges in applying embedding tools to large-scale KGs in real life, in the biomedical domain and possibly beyond.
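The idea of restricting random walks to a regular-expression pathway can be illustrated with a minimal sketch. This is not the CompactWalks implementation, whose pathway language and code are not shown on this page. The toy graph, the node types, the pattern `Drug( Gene)+ Disease`, and all function names below are hypothetical, and the brute-force path enumeration is only suitable for tiny graphs.

```python
import random
import re

# Hypothetical toy KG: node -> type, plus undirected edges.
NODE_TYPE = {
    "aspirin": "Drug", "ibuprofen": "Drug",
    "PTGS2": "Gene", "TP53": "Gene",
    "pain": "Disease",
    "paper42": "Publication",  # stands in for a promiscuous node
}
EDGES = [
    ("aspirin", "PTGS2"), ("ibuprofen", "PTGS2"), ("PTGS2", "pain"),
    ("aspirin", "paper42"), ("ibuprofen", "paper42"),
    ("TP53", "paper42"), ("pain", "paper42"),
]


def neighbors(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pathway_edges(adj, node_type, pattern, max_len=4):
    """Keep edges on simple paths whose node-type sequence fully matches `pattern`."""
    regex = re.compile(pattern)
    kept = set()

    def dfs(path):
        types = " ".join(node_type[n] for n in path)
        if regex.fullmatch(types):
            kept.update(frozenset(e) for e in zip(path, path[1:]))
        if len(path) < max_len:
            for nxt in adj.get(path[-1], ()):
                if nxt not in path:  # simple paths only
                    dfs(path + [nxt])

    for start in adj:
        dfs([start])
    return kept


def compact_walks(kept, starts, walk_len=5, walks_per_node=2, seed=0):
    """Uniform random walks confined to the pathway-induced subgraph."""
    rng = random.Random(seed)
    sub = neighbors(tuple(e) for e in kept)
    walks = []
    for s in starts:
        if s not in sub:
            continue  # node does not lie on any matching pathway
        for _ in range(walks_per_node):
            walk = [s]
            while len(walk) < walk_len:
                walk.append(rng.choice(sorted(sub[walk[-1]])))
            walks.append(walk)
    return walks


ADJ = neighbors(EDGES)
KEPT = pathway_edges(ADJ, NODE_TYPE, r"Drug( Gene)+ Disease")
WALKS = compact_walks(KEPT, ["aspirin", "ibuprofen"])
```

In this toy example the walks never visit `paper42`, because no path through it matches the pattern. As in DeepWalk or node2vec, such walks could then be passed to a skip-gram model to learn node embeddings.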


Supplemental Material

SIGMOD22-modds026-hou.mp4

Knowledge-graph (KG) embeddings have been widely used to address challenges faced by biomedical research, due to the presumed semanticity of the learned embeddings. However, the node neighborhoods picked up by the embeddings do not always appropriately reflect the semantics of the nodes. One of the reasons is that nodes in KGs can be promiscuous, that is, associated with different relationships that are not node specific. In addition, learning embeddings does not always scale to large-size KGs. To address these challenges, we propose a framework that uses domain- and task-specific information to define task-appropriate neighborhoods of KG nodes of interest. We report the results of testing the framework with use cases of drug prediction, disease prediction, and drug clustering. Our findings suggest that the framework can address the promiscuity and runtime-performance challenges in applying embedding tools to large-scale KGs, in the biomedical domain and possibly beyond.

mp4

105.4 MB


            • Published in

              SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data
              June 2022
              2597 pages
              ISBN: 9781450392495
              DOI: 10.1145/3514221

              Copyright © 2022 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States


              Acceptance Rates

              Overall Acceptance Rate: 785 of 4,003 submissions, 20%