research-article

Cat2Type: Wikipedia Category Embeddings for Entity Typing in Knowledge Graphs

Authors:

Radina Sofronova,

Mehwish Alam

K-CAP '21: Proceedings of the 11th Knowledge Capture Conference

Pages 81 - 88

https://doi.org/10.1145/3460210.3493575

Published: 02 December 2021

Abstract

The entity type information in Knowledge Graphs (KGs) such as DBpedia and Freebase is often incomplete because the graphs are generated automatically. Entity typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper introduces Cat2Type, an approach that exploits Wikipedia categories to predict missing entity types in a KG. It draws on two sources of rich semantic information about entities: Wikipedia category names and the Wikipedia category graph. The characteristic features of entities encapsulated in category names are captured with neural language models. In addition, a Wikipedia category graph is constructed to capture the connections between categories, and node-level representations are learned by optimizing the neighbourhood information on this graph. These representations are then used for entity type prediction via classification. Cat2Type is evaluated on two real-world benchmark datasets, DBpedia630k and FIGER. The experiments show that Cat2Type achieves a significant improvement over state-of-the-art approaches.
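To make the two inputs described in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy mapping from entities to category names and from categories to parent categories (all data and the function names `category_text`, `build_category_graph`, and `random_walks` are hypothetical). It shows how category names could be joined into text for a language model, and how uniform random walks over a category graph could be produced as input for a neighbourhood-based node embedding method. The abstract does not state which walk strategy or embedding model Cat2Type uses, so the uniform walk here is an assumption.

```python
import random
from collections import defaultdict

# Hypothetical toy data: entity -> Wikipedia category names.
ENTITY_CATEGORIES = {
    "Albert_Einstein": ["German physicists", "Nobel laureates in Physics"],
    "Marie_Curie": ["Polish physicists", "Nobel laureates in Physics"],
    "Berlin": ["Capitals in Europe", "Cities in Germany"],
}

# Hypothetical toy data: category -> parent categories.
CATEGORY_PARENTS = {
    "German physicists": ["Physicists by nationality"],
    "Polish physicists": ["Physicists by nationality"],
    "Nobel laureates in Physics": ["Nobel laureates"],
    "Capitals in Europe": ["Capitals"],
    "Cities in Germany": ["Cities by country"],
}


def category_text(entity):
    """Join an entity's category names into one string.

    A string like this could be passed to a neural language model
    to obtain a name-based representation (model choice is an assumption).
    """
    return "; ".join(ENTITY_CATEGORIES[entity])


def build_category_graph(category_parents):
    """Build an undirected adjacency list over categories."""
    graph = defaultdict(set)
    for child, parents in category_parents.items():
        for parent in parents:
            graph[child].add(parent)
            graph[parent].add(child)
    # Sorted neighbour lists keep the walks reproducible for a fixed seed.
    return {node: sorted(neighbours) for node, neighbours in graph.items()}


def random_walks(graph, walk_length=5, walks_per_node=3, seed=0):
    """Generate uniform random walks starting from every node.

    Walks like these are a common input to skip-gram style node
    embedding methods; the uniform transition rule is an assumption.
    """
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


if __name__ == "__main__":
    graph = build_category_graph(CATEGORY_PARENTS)
    print(category_text("Berlin"))
    for walk in random_walks(graph)[:3]:
        print(" -> ".join(walk))
```

In a full pipeline, the name-based and graph-based vectors of an entity's categories would be combined and passed to a classifier that predicts the entity type; that step is omitted here because the abstract does not specify how the representations are combined.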

Cited By

Zhang X, Li X, Wang H (2024). Fine-Grained Entity-Type Completion Based on Neighborhood-Attention and Cartesian–Polar Coordinates Mapping. International Journal of Software Engineering and Knowledge Engineering, pp. 1-28. Online publication date: 19-Jun-2024.
https://doi.org/10.1142/S0218194024500268
Zou C, An J, Li G (2022). Knowledge Graph Entity Type Prediction with Relational Aggregation Graph Attention Network. The Semantic Web, pp. 39-55. Online publication date: 29-May-2022.
https://dl.acm.org/doi/10.1007/978-3-031-06981-9_3

Index Terms

Cat2Type: Wikipedia Category Embeddings for Entity Typing in Knowledge Graphs
1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
      1. Semantic networks
    2. Natural language processing
      1. Information extraction
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published In

K-CAP '21: Proceedings of the 11th Knowledge Capture Conference

December 2021

300 pages

ISBN: 9781450384575

DOI: 10.1145/3460210

General Chair: Anna Lisa Gentile, IBM Research Almaden, USA

Program Chair: Rafael Gonçalves, Center for Computational Biomedicine, Harvard Medical School, USA

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGAI: ACM Special Interest Group on Artificial Intelligence

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

K-CAP '21

Sponsor:

SIGAI

K-CAP '21: Knowledge Capture Conference

December 2 - 3, 2021

Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%
