DOI: 10.1145/3460210.3493575
research-article

Cat2Type: Wikipedia Category Embeddings for Entity Typing in Knowledge Graphs

Published: 02 December 2021

Abstract

Entity type information in Knowledge Graphs (KGs) such as DBpedia and Freebase is often incomplete because the KGs are generated automatically. Entity typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper introduces Cat2Type, an approach that exploits Wikipedia categories to predict missing entity types in a KG. It extracts information from two sources that carry rich semantic information about entities: Wikipedia category names and the Wikipedia category graph. Cat2Type uses neural language models to exploit the characteristic features of entities encapsulated in category names. It also constructs a Wikipedia category graph to capture the connections between categories, and learns node-level representations by optimizing the neighbourhood information on that graph. These representations are then used to predict entity types via classification. The performance of Cat2Type is assessed on two real-world benchmark datasets, DBpedia630k and FIGER. The experiments show that Cat2Type achieves a significant improvement over state-of-the-art approaches.
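
The graph side of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the categories, entities, and type labels are hypothetical toy data, co-occurrence counts over uniform random walks stand in for learned node embeddings, a nearest-centroid rule stands in for the trained classifier, and the language-model encoding of category names is left out entirely.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy category graph; not taken from the paper's data.
EDGES = [
    ("German physicists", "Physicists"), ("Physicists", "Scientists"),
    ("German chemists", "Chemists"), ("Chemists", "Scientists"),
    ("Cities in Germany", "Cities"), ("Capitals in Europe", "Cities"),
    ("Cities", "Populated places"),
]
# Hypothetical training entities: name -> (categories, type label).
TRAIN = {
    "Albert Einstein": (["German physicists"], "Person"),
    "Marie Curie": (["Chemists"], "Person"),
    "Berlin": (["Cities in Germany", "Capitals in Europe"], "Place"),
}

def build_graph(edges):
    """Undirected adjacency sets for the category graph."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def random_walks(graph, walks_per_node=20, walk_length=5, seed=0):
    """Uniform random walks that sample each node's neighbourhood."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(sorted(graph[walk[-1]])))
            walks.append(walk)
    return walks

def normalise(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else v

def node_vectors(graph, walks, window=2):
    """Stand-in for learned embeddings: windowed co-occurrence counts."""
    nodes = sorted(graph)
    index = {n: i for i, n in enumerate(nodes)}
    vectors = {n: [0.0] * len(nodes) for n in nodes}
    for n in nodes:
        vectors[n][index[n]] = 1.0  # each node counts itself once
    for walk in walks:
        for i, centre in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if j != i:
                    vectors[centre][index[walk[j]]] += 1.0
    return {n: normalise(v) for n, v in vectors.items()}

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0

def train_centroids(train, vectors):
    """One centroid per type: mean of its training entities' vectors."""
    by_type = defaultdict(list)
    for categories, label in train.values():
        by_type[label].append(mean([vectors[c] for c in categories]))
    return {label: mean(vs) for label, vs in by_type.items()}

def predict(categories, vectors, centroids):
    """Assign the type whose centroid is closest in cosine similarity."""
    v = mean([vectors[c] for c in categories])
    return max(centroids, key=lambda label: cosine(v, centroids[label]))

graph = build_graph(EDGES)
vectors = node_vectors(graph, random_walks(graph))
centroids = train_centroids(TRAIN, vectors)
print(predict(["German chemists"], vectors, centroids))
print(predict(["Cities in Germany"], vectors, centroids))
```

In this toy graph the person-related and place-related categories form separate connected components, so the walks never mix them and the prediction is easy. A real category graph is far denser, which is where learned embeddings and a trained classifier would matter.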

Cited By

  • (2024) Fine-Grained Entity-Type Completion Based on Neighborhood-Attention and Cartesian–Polar Coordinates Mapping. International Journal of Software Engineering and Knowledge Engineering, 1-28. DOI: 10.1142/S0218194024500268. Online publication date: 19-Jun-2024
  • (2022) Knowledge Graph Entity Type Prediction with Relational Aggregation Graph Attention Network. The Semantic Web, 39-55. DOI: 10.1007/978-3-031-06981-9_3. Online publication date: 29-May-2022

Published In

K-CAP '21: Proceedings of the 11th Knowledge Capture Conference
December 2021
300 pages
ISBN: 9781450384575
DOI: 10.1145/3460210

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. entity type prediction
  2. language models
  3. node embeddings
  4. wikipedia categories

Qualifiers

  • Research-article

Conference

K-CAP '21
K-CAP '21: Knowledge Capture Conference
December 2 - 3, 2021
Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%
