Compact Walks: Taming Knowledge-Graph Embeddings with Domain- and Task-Specific Pathways

Authors:
Pei-Yu Hou, North Carolina State University, Raleigh, NC, USA
Daniel R. Korn, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Cleber C. Melo-Filho, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
David R. Wright, North Carolina State University, Raleigh, NC, USA
Alexander Tropsha, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Rada Chirkova, North Carolina State University, Raleigh, NC, USA

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data, June 2022, Pages 458–469. https://doi.org/10.1145/3514221.3517903

Published: 11 June 2022

ABSTRACT

Knowledge-graph (KG) embeddings have emerged as a promising approach to challenges faced by modern biomedical research, including the growing gap between therapeutic needs and available treatments. The popularity of KG embeddings in graph analytics is on the rise, due at least partially to the presumed semanticity of the learned embeddings. Unfortunately, how well a node neighborhood picked up by an embedding captures the node's semantics may depend on the characteristics of the data. One reason is that KG nodes can be promiscuous, that is, associated with a number of different relationships that are not unique to, or indicative of, the properties of the nodes.

To address the promiscuity challenge and the documented runtime-performance challenge in real-life KG embedding tools, we propose to use domain- and task-specific information to specify regular-expression pathways that define neighborhoods of KG nodes of interest. Our proposed CompactWalks framework uses these semantic subgraphs to enable meaningful compact walks in random-walk-based KG embedding methods. We report the results of case studies for the task of determining which pharmaceutical drugs could treat the same diseases. The findings suggest that our CompactWalks approach has the potential to address the promiscuity and runtime-performance challenges in applying embedding tools to large-scale KGs in real life, in the biomedical domain and possibly beyond.
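The idea of a regular-expression pathway restricting random walks can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, the one-letter node types (D for drug, G for gene, S for disease, X for a generic hub), the pattern `DG+S`, and the helper names `pathway_subgraph` and `compact_walks` are all assumptions made for illustration. The sketch enumerates short simple paths from a source node, keeps those whose node-type sequence fully matches the pattern, and then runs uniform random walks only inside the resulting subgraph.

```python
import random
import re
from collections import defaultdict

# Hypothetical toy KG (illustration only): node name -> one-letter node type.
NODE_TYPE = {
    "aspirin": "D", "ibuprofen": "D",
    "PTGS1": "G", "PTGS2": "G",
    "inflammation": "S",
    "literature_hub": "X",  # a "promiscuous" node linked to everything
}
EDGES = [
    ("aspirin", "PTGS1"), ("aspirin", "PTGS2"), ("ibuprofen", "PTGS2"),
    ("PTGS1", "PTGS2"),
    ("PTGS1", "inflammation"), ("PTGS2", "inflammation"),
    ("literature_hub", "aspirin"), ("literature_hub", "ibuprofen"),
    ("literature_hub", "PTGS1"), ("literature_hub", "PTGS2"),
    ("literature_hub", "inflammation"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def pathway_subgraph(adj, source, pattern, max_nodes=4):
    """Collect the edges of all simple paths from `source` (at most
    `max_nodes` nodes long) whose node-type string fully matches `pattern`."""
    regex = re.compile(pattern)
    sub = defaultdict(set)

    def extend(path):
        types = "".join(NODE_TYPE[n] for n in path)
        if regex.fullmatch(types):
            for u, v in zip(path, path[1:]):
                sub[u].add(v)
                sub[v].add(u)
        if len(path) < max_nodes:
            for nxt in sorted(adj[path[-1]]):
                if nxt not in path:  # keep paths simple (no repeated nodes)
                    extend(path + [nxt])

    extend([source])
    return sub


def compact_walks(sub, start, num_walks=5, walk_len=6, seed=0):
    """Uniform random walks that never leave the pathway subgraph."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        walk = [start]
        while len(walk) < walk_len and sub[walk[-1]]:
            walk.append(rng.choice(sorted(sub[walk[-1]])))
        walks.append(walk)
    return walks


adj = build_adjacency(EDGES)
# Assumed pathway: a drug, then one or more genes, then a disease.
sub = pathway_subgraph(adj, "aspirin", r"DG+S")
walks = compact_walks(sub, "aspirin")
for walk in walks:
    print(" -> ".join(walk))
```

In this toy setting the hub node never appears on a matching path, so it is absent from the subgraph and cannot dominate the walks. The walks could then be fed to a skip-gram model as in other random-walk-based embedding methods; that step is omitted here.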

Supplemental Material

SIGMOD22-modds026-hou.mp4

Knowledge-graph (KG) embeddings have been widely used to address challenges faced by biomedical research, due to the presumed semanticity of the learned embeddings. However, the node neighborhoods picked up by the embeddings do not always appropriately reflect the semantics of the nodes. One of the reasons is that nodes in KGs can be promiscuous, that is, associated with different relationships that are not node specific. In addition, learning embeddings does not always scale to large-size KGs. To address these challenges, we propose a framework that uses domain- and task-specific information to define task-appropriate neighborhoods of KG nodes of interest. We report the results of testing the framework with use cases of drug prediction, disease prediction, and drug clustering. Our findings suggest that the framework can address the promiscuity and runtime-performance challenges in applying embedding tools to large-scale KGs, in the biomedical domain and possibly beyond.

Available for Download

SIGMOD22-modds026-hou.mp4 (105.4 MB)
SIGMOD22-modds026-hou.vtt (15.6 KB)


Published in
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data
June 2022, 2597 pages
ISBN: 9781450392495
DOI: 10.1145/3514221
General Chair: Zachary Ives, University of Pennsylvania (USA)
Program Chairs: Angela Bonifati, Lyon 1 University (France); Amr El Abbadi, University of California, Santa Barbara (USA)
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher: Association for Computing Machinery, New York, NY, United States
Badges: Artifacts Evaluated & Reusable / v1.1
Author Tags
KG embeddings
biomedical knowledge graphs (KGs)
domain- and task-specific regular expressions for creating node neighborhoods
Qualifiers: research-article
Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%