DOI: 10.1145/3412841.3441959
research-article
Best Paper

Taxonomy extraction using knowledge graph embeddings and hierarchical clustering

Published: 22 April 2021

ABSTRACT

While high-quality taxonomies are essential to the Semantic Web, building them for large knowledge graphs is an expensive process. Likewise, creating taxonomies that accurately reflect the content of dynamic knowledge graphs is another challenge. In this paper, we propose a method to automatically extract a taxonomy from knowledge graph embeddings, and evaluate it on DBpedia. Our approach produces a taxonomy by leveraging the type information contained in the graph and the tree-like structure of an unsupervised hierarchical clustering performed over entity embeddings. We then extend our method with an axiom induction mechanism which allows us to identify new classes from the data and describe them with logical axioms, thus leading to expressive taxonomy extraction.
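
The core idea in the abstract, reading a taxonomy off the tree produced by hierarchical clustering of entity embeddings, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy 2-D vectors, the type sets, the average-linkage criterion, the `Thing` root, and the rule that a type shared by every entity under a tree node becomes a subclass of the nearest typed ancestor. The paper's actual procedure may differ.

```python
import math
from itertools import combinations

# Hypothetical toy data: 2-D "embeddings" and the types asserted for each entity.
EMBEDDINGS = {
    "Paris": (0.0, 0.2), "Berlin": (0.1, 0.0), "Everest": (1.5, 1.0),
    "Einstein": (5.0, 5.1), "Curie": (5.1, 5.0), "Mozart": (5.0, 6.0),
}
TYPES = {
    "Paris": {"Place", "City"}, "Berlin": {"Place", "City"},
    "Everest": {"Place", "Mountain"},
    "Einstein": {"Person", "Scientist"}, "Curie": {"Person", "Scientist"},
    "Mozart": {"Person", "Musician"},
}


def leaves(tree):
    """Entity names under a tree node (a leaf is a plain string)."""
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])


def average_linkage(a, b):
    """Mean Euclidean distance over all cross-cluster entity pairs."""
    pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
    return sum(math.dist(EMBEDDINGS[x], EMBEDDINGS[y]) for x, y in pairs) / len(pairs)


def cluster(entities):
    """Naive agglomerative clustering; returns a binary tree as nested tuples."""
    trees = list(entities)
    while len(trees) > 1:
        # Merge the two closest clusters into one internal node.
        a, b = min(combinations(trees, 2), key=lambda p: average_linkage(*p))
        trees = [t for t in trees if t not in (a, b)] + [(a, b)]
    return trees[0]


def type_frequency(t):
    """Number of entities carrying type t (more frequent = assumed more general)."""
    return sum(t in types for types in TYPES.values())


def taxonomy_edges(tree, parent="Thing", seen=frozenset()):
    """Walk the tree top-down and return (subclass, superclass) pairs.

    Assumed rule: a type shared by every entity under a node, and not yet
    seen on the path from the root, is placed under the nearest typed ancestor.
    """
    common = set.intersection(*(TYPES[e] for e in leaves(tree)))
    edges = []
    # If several new types appear at once, chain them from general to specific.
    for t in sorted(common - seen, key=lambda t: (-type_frequency(t), t)):
        edges.append((t, parent))
        parent = t
    if not isinstance(tree, str):
        for child in tree:
            edges += taxonomy_edges(child, parent, seen | common)
    return edges


tree = cluster(sorted(EMBEDDINGS))
edges = sorted(set(taxonomy_edges(tree)))
for sub, sup in edges:
    print(f"{sub} subClassOf {sup}")
```

On this toy data the sketch places `City` and `Mountain` under `Place`, and `Scientist` and `Musician` under `Person`, with `Place` and `Person` directly under the assumed root `Thing`. The quadratic pair search is only suitable for a handful of entities; a graph the size of DBpedia would need an optimized clustering library.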

      • Published in

        SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing
        March 2021
        2075 pages
        ISBN: 9781450381048
        DOI: 10.1145/3412841

        Copyright © 2021 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
