ABSTRACT
While high-quality taxonomies are essential to the Semantic Web, building them for large knowledge graphs is expensive. Keeping a taxonomy aligned with the content of a dynamic knowledge graph is a further challenge. In this paper, we propose a method to automatically extract a taxonomy from knowledge graph embeddings, and evaluate it on DBpedia. Our approach produces a taxonomy by combining the type information contained in the graph with the tree-like structure of an unsupervised hierarchical clustering performed over entity embeddings. We then extend our method with an axiom induction mechanism that identifies new classes from the data and describes them with logical axioms, leading to expressive taxonomy extraction.
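The general idea of combining a clustering tree with type information can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the toy 2-D vectors, the type sets, and the mapping rule (each type goes to the smallest cluster that contains all of its entities) are assumptions made for illustration, and the helper names `smallest_covering_clusters` and `subsumptions` are hypothetical. The clustering itself uses SciPy's `linkage` and `to_tree`.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree

# Toy 2-D "embeddings" and type assertions (illustrative values only).
embeddings = np.array([
    [0.0, 0.0], [0.1, 0.2],      # persons
    [2.0, 0.0], [2.1, 0.2],      # organisations
    [10.0, 10.0], [10.2, 10.1],  # places
])
entity_types = [
    {"Agent", "Person"}, {"Agent", "Person"},
    {"Agent", "Organisation"}, {"Agent", "Organisation"},
    {"Place"}, {"Place"},
]


def smallest_covering_clusters(root, entity_types):
    """Map each type to the leaf set of the smallest cluster holding all its entities.

    Assumed mapping rule for this sketch, not the paper's criterion.
    """
    all_types = set().union(*entity_types)
    members = {
        t: {i for i, ts in enumerate(entity_types) if t in ts} for t in all_types
    }
    best = {}
    stack = [root]
    while stack:
        node = stack.pop()
        leaves = frozenset(node.pre_order())  # entity indices under this cluster
        for t, ents in members.items():
            if ents <= leaves and (t not in best or len(leaves) < len(best[t])):
                best[t] = leaves
        if not node.is_leaf():
            stack.extend([node.left, node.right])
    return best


def subsumptions(best):
    """Place type a under type b when a's cluster lies strictly inside b's."""
    return sorted((a, b) for a in best for b in best if best[a] < best[b])


# Agglomerative clustering over the embeddings, converted to a binary tree.
tree = to_tree(linkage(embeddings, method="average"))
clusters = smallest_covering_clusters(tree, entity_types)
print(subsumptions(clusters))
```

The pairs returned by `subsumptions` are (subclass, superclass) candidates. Requiring a cluster to contain every entity of a type is brittle on noisy data, so a real pipeline would likely relax it to a coverage threshold or a scoring function.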
Index Terms
- Taxonomy extraction using knowledge graph embeddings and hierarchical clustering