
Pairwise Link Prediction Model for Out of Vocabulary Knowledge Base Entities

Published: 02 September 2020

Abstract

Real-world knowledge bases such as DBpedia, YAGO, and Freebase are sparsely connected, which poses a severe challenge to link prediction between entities. To cope with this data scarcity, recent models have focused on learning interactions between entity pairs through the relations that hold between them. Although this approach is promising, some relations are associated with very few head or tail entities, so the relation interaction between entities is poorly estimated. In this article, we remove the sole dependence on relation interactions between entity pairs by associating each triple with pairwise embeddings, i.e., distributed vector representations for pairs formed from the word-based entities and the relation of the triple. We capture the interactions between pairwise embeddings with a Pairwise Factorization Model that employs a factorization machine with relation attention. This approach allows the parameters of related interactions to be estimated efficiently and keeps the pairwise embeddings discriminative, which provides strong supervisory signals for the decoding task of link prediction. The proposed Pairwise Factorization Model uses a neural bag-of-words model as its encoder, which encodes word-based entities into distributed vector representations for the decoder. The model is simple and efficient, and it outperforms more complex state-of-the-art models at link prediction on the benchmark datasets DBPedia50K and FB15K-237.
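
The abstract names two components: a neural bag-of-words encoder and a factorization machine with relation attention. The sketch below is only an illustration of how such pieces could fit together and is not the authors' implementation. It assumes that an entity is encoded as the average of its word vectors, that each pairwise embedding is an element-wise product of two of the three vectors (head, relation, tail), and that attention weights come from a softmax over dot products with the relation vector. The function names `bow_encode`, `softmax`, and `score_triple`, and the toy vectors, are all hypothetical.

```python
import math


def bow_encode(tokens, word_vecs):
    """Bag-of-words encoder (assumed form): average the vectors of known words."""
    dim = len(next(iter(word_vecs.values())))
    known = [word_vecs[t] for t in tokens if t in word_vecs]
    if not known:
        # No known words: fall back to a zero vector.
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def score_triple(head_tokens, rel_vec, tail_tokens, word_vecs):
    """Score a (head, relation, tail) triple from its three pairwise embeddings."""
    h = bow_encode(head_tokens, word_vecs)
    t = bow_encode(tail_tokens, word_vecs)

    def hadamard(a, b):
        return [x * y for x, y in zip(a, b)]

    # Pairwise embeddings (assumption): (head, relation), (relation, tail), (head, tail).
    pairs = [hadamard(h, rel_vec), hadamard(rel_vec, t), hadamard(h, t)]

    # Relation attention (assumption): weight each pair by its affinity to the relation.
    logits = [sum(r * p for r, p in zip(rel_vec, pair)) for pair in pairs]
    weights = softmax(logits)

    # Summing a pair's components gives the dot product of its two vectors,
    # which is the second-order interaction term of a factorization machine.
    return sum(w * sum(pair) for w, pair in zip(weights, pairs))


# Toy vectors, invented for illustration only.
word_vecs = {"new": [1.0, 0.0], "york": [0.0, 1.0], "city": [1.0, 1.0]}
located_in = [0.5, -0.5]
score = score_triple(["new", "york"], located_in, ["city"], word_vecs)
```

Because the encoder works from an entity's words rather than from a per-entity lookup table, an entity that never appeared during training can still be given a vector as long as some of its words are known, which is presumably what makes the out-of-vocabulary setting in the title tractable.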



      Published In

      ACM Transactions on Information Systems, Volume 38, Issue 4
      October 2020
      375 pages
      ISSN: 1046-8188
      EISSN: 1558-2868
      DOI: 10.1145/3402434

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 02 September 2020
      Accepted: 01 June 2020
      Revised: 01 April 2020
      Received: 01 September 2019
      Published in TOIS Volume 38, Issue 4


      Author Tags

      1. Knowledge bases
      2. graph convolutional networks
      3. representation learning

      Qualifiers

      • Research-article
      • Research
      • Refereed
