Pairwise Link Prediction Model for Out of Vocabulary Knowledge Base Entities

Authors:

Fanshuang Kong,

Xudong Liu

ACM Transactions on Information Systems (TOIS), Volume 38, Issue 4

Article No.: 36, Pages 1 - 28

https://doi.org/10.1145/3406116

Published: 02 September 2020

Abstract

Real-world knowledge bases such as DBpedia, YAGO, and Freebase are sparsely connected, which poses a severe challenge to link prediction between entities. To cope with this data scarcity, recent models have focused on learning interactions between entity pairs by means of the relations that exist between them. Although this approach is promising, some relations are associated with very few tail or head entities, which results in poor estimation of the relation interaction between entities. In this article, we break the sole dependency on modeling relation interactions between entity pairs by associating a triple with pairwise embeddings, i.e., distributed vector representations for pairs formed from the word-based entities and the relation of a triple. We capture the interactions among pairwise embeddings with a Pairwise Factorization Model that employs a factorization machine with relation attention. This approach allows the parameters for related interactions to be estimated efficiently and ensures that the pairwise embeddings are discriminative, providing strong supervisory signals for the decoding task of link prediction. The Pairwise Factorization Model exploits a neural bag-of-words model as the encoder, which encodes word-based entities into distributed vector representations for the decoder. The proposed model is simple yet efficient and effective, showing superior link prediction performance over more complex state-of-the-art models on the benchmark datasets DBPedia50K and FB15K-237.
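To make the ingredients named in the abstract more concrete, the following is a minimal sketch and not the authors' model. It assumes that an entity is encoded by averaging the vectors of its words (a bag-of-words encoder that skips unknown words), that pairwise embeddings are element-wise products of the (head, relation), (relation, tail), and (head, tail) vectors, and that attention weights come from a softmax over relation-conditioned logits. All function names, the choice of element-wise products, and the attention form are illustrative assumptions; the abstract does not specify them.

```python
import math


def encode_bag_of_words(words, word_vectors, dim):
    """Average the vectors of the known words; unknown words are skipped."""
    known = [word_vectors[w] for w in words if w in word_vectors]
    if not known:
        # No known word at all: fall back to a zero vector (assumption).
        return [0.0] * dim
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]


def hadamard(a, b):
    """Element-wise product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_triple(head_words, relation_vec, tail_words, word_vectors):
    """Score (head, relation, tail) from attention-weighted pair interactions."""
    dim = len(relation_vec)
    h = encode_bag_of_words(head_words, word_vectors, dim)
    t = encode_bag_of_words(tail_words, word_vectors, dim)

    # Assumed pairwise embeddings: one vector per pair of triple elements.
    pairs = [hadamard(h, relation_vec), hadamard(relation_vec, t), hadamard(h, t)]

    # Factorization-machine style second-order terms between pairwise embeddings.
    interactions = [
        hadamard(pairs[i], pairs[j]) for i in range(3) for j in range(i + 1, 3)
    ]

    # Assumed relation attention: logits are dot products with the relation vector.
    logits = [sum(r * x for r, x in zip(relation_vec, inter)) for inter in interactions]
    weights = softmax(logits)

    # Final score: attention-weighted sum of the interaction terms.
    return sum(w * sum(inter) for w, inter in zip(weights, interactions))
```

Because the encoder only needs word vectors, an entity that never appeared during training can still be scored as long as some of its words are known, which is the out-of-vocabulary setting the title refers to. In a trained model the word vectors, relation vectors, and attention parameters would be learned rather than supplied by hand.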

Index Terms

Pairwise Link Prediction Model for Out of Vocabulary Knowledge Base Entities
1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
    2. Natural language processing
      1. Information extraction

Published In

ACM Transactions on Information Systems Volume 38, Issue 4

October 2020

375 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/3402434

Editor:
Min Zhang
Tsinghua University, China

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 September 2020

Accepted: 01 June 2020

Revised: 01 April 2020

Received: 01 September 2019

Published in TOIS Volume 38, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Natural Science Foundation of China

Cited By

Chen Y, Mensah S, Li J (2024). Learning distributed representations of knowledge that preserve deductive reasoning. Knowledge-Based Systems 293 (111635). Online publication date: Jun-2024.
https://doi.org/10.1016/j.knosys.2024.111635
Fang Y, Li X, Ye R, Tan X, Zhao P, Wang M (2023). Relation-aware Graph Convolutional Networks for Multi-relational Network Alignment. ACM Transactions on Intelligent Systems and Technology 14:2 (1-23). Online publication date: 9-Jan-2023.
https://dl.acm.org/doi/10.1145/3579827
Zhang R, Kim J, Mei J, Mao Y (2022). Knowledge Base Embedding for Sampling-Based Prediction. ACM Transactions on Information Systems 41:2 (1-25). Online publication date: 11-Jun-2022.
https://dl.acm.org/doi/10.1145/3533769
Sun K, Zhang R, Mensah S, Mao Y, Liu X (2022). Learning Implicit and Explicit Multi-task Interactions for Information Extraction. ACM Transactions on Information Systems 41:2 (1-29). Online publication date: 11-Jun-2022.
https://dl.acm.org/doi/10.1145/3533020
Wang D, Li H, Wang W, Qiao L (2022). Cross-Modal Knowledge Graph Construction for Multiple Food Additives. Proceedings of 2022 Chinese Intelligent Systems Conference (839-847). Online publication date: 24-Sep-2022.
https://doi.org/10.1007/978-981-19-6226-4_80
Kerrache S (2021). LinkPred: a high performance library for link prediction in complex networks. PeerJ Computer Science 7 (e521). Online publication date: 21-May-2021.
https://doi.org/10.7717/peerj-cs.521
Yan S, Lin K, Zheng X, Wang H (2021). LkeRec: Toward Lightweight End-to-End Joint Representation Learning for Building Accurate and Effective Recommendation. ACM Transactions on Information Systems 40:3 (1-28). Online publication date: 14-Dec-2021.
https://dl.acm.org/doi/10.1145/3486673
Liang S, Luo Y, Meng Z (2021). Profiling Users for Question Answering Communities via Flow-Based Constrained Co-Embedding Model. ACM Transactions on Information Systems 40:2 (1-38). Online publication date: 24-Nov-2021.
https://dl.acm.org/doi/10.1145/3470565
Zhang R, Guo J, Chen L, Fan Y, Cheng X (2021). A Review on Question Generation from Natural Language Text. ACM Transactions on Information Systems 40:1 (1-43). Online publication date: 8-Sep-2021.
https://dl.acm.org/doi/10.1145/3468889
Zang H, Han D, Li X, Wan Z, Wang M (2021). CHA: Categorical Hierarchy-based Attention for Next POI Recommendation. ACM Transactions on Information Systems 40:1 (1-22). Online publication date: 8-Sep-2021.
https://dl.acm.org/doi/10.1145/3464300
