DOI: 10.1145/3488560.3498437
Research Article | Public Access

DualDE: Dually Distilling Knowledge Graph Embedding for Faster and Cheaper Reasoning

Published: 15 February 2022

Abstract

Knowledge Graph Embedding (KGE) is a popular method for KG reasoning, and higher-dimensional KGEs are usually preferred because they offer better reasoning capability. However, high-dimensional KGEs place heavy demands on storage and computing resources, making them unsuitable for resource-limited or time-constrained applications where faster and cheaper reasoning is necessary. To address this problem, we propose DualDE, a knowledge distillation method that builds a low-dimensional student KGE from a pre-trained high-dimensional teacher KGE. DualDE considers the dual influence between the teacher and the student: we propose a soft label evaluation mechanism that adaptively assigns different soft-label and hard-label weights to different triples, and a two-stage distillation approach that improves the student's acceptance of the teacher. DualDE is general enough to be applied to various KGEs. Experimental results show that our method reduces the embedding parameters of a high-dimensional KGE by 7×–15× and increases inference speed by 2×–6× while retaining high performance. An ablation study further verifies the effectiveness of the soft label evaluation mechanism and the two-stage distillation approach.
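
To make the distillation objective concrete, below is a minimal PyTorch sketch of how a per-triple blend of hard-label and soft-label losses could be implemented. The function name dual_distill_loss, the score conventions, and the per-triple weighting heuristic are illustrative assumptions, not the paper's exact soft label evaluation mechanism, which is defined in the full text.

```python
import torch
import torch.nn.functional as F

def dual_distill_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor) -> torch.Tensor:
    """Blend hard-label and soft-label (teacher) losses per triple.

    student_logits, teacher_logits: shape (B,), plausibility scores that
        the student/teacher KGE assigns to a batch of B (head, relation,
        tail) triples.
    labels: shape (B,), 1.0 for positive triples, 0.0 for negative samples.
    """
    teacher_logits = teacher_logits.detach()  # the teacher is pre-trained

    # Hard-label term: fit the ground-truth labels of the triples.
    hard = F.binary_cross_entropy_with_logits(
        student_logits, labels, reduction="none")

    # Soft-label term: match the teacher's scores.
    soft = F.mse_loss(student_logits, teacher_logits, reduction="none")

    # Illustrative per-triple soft-label weight: trust the teacher more on
    # triples it already scores correctly (high probability on positives,
    # low on negatives). This heuristic is a stand-in for DualDE's soft
    # label evaluation mechanism.
    p_teacher = torch.sigmoid(teacher_logits)
    w_soft = labels * p_teacher + (1.0 - labels) * (1.0 - p_teacher)
    w_hard = 1.0 - w_soft

    return (w_hard * hard + w_soft * soft).mean()
```

The two-stage approach mentioned above would then reuse a loss of this shape while alternating which side of the teacher-student pair is updated, reflecting the dual influence the abstract describes; the exact staging is given in the full paper.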

Supplementary Material

MP4 File (WSDM22-fp348.mp4)
Presentation video


      Published In

      WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
      February 2022, 1690 pages
      ISBN: 9781450391320
      DOI: 10.1145/3488560

      Publisher

      Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. fast embedding
      2. knowledge distillation
      3. knowledge graph embedding


      Conference

      WSDM '22

      Acceptance Rates

      Overall Acceptance Rate 498 of 2,863 submissions, 17%

      Article Metrics

      • Downloads (last 12 months): 260
      • Downloads (last 6 weeks): 14

      Reflects downloads up to 13 Feb 2025

      Cited By

      • (2024) Improvement of Web Semantic and Transformer-Based Knowledge Graph Completion in Low-Dimensional Spaces. International Journal on Semantic Web and Information Systems 20(1), 1-18. DOI: 10.4018/IJSWIS.336919. Online publication date: 31-Jan-2024.
      • (2024) Knowledge Graph Embedding Using a Multi-Channel Interactive Convolutional Neural Network with Triple Attention. Mathematics 12(18), 2821. DOI: 10.3390/math12182821. Online publication date: 11-Sep-2024.
      • (2024) Towards continual knowledge graph embedding via incremental distillation. Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 8759-8768. DOI: 10.1609/aaai.v38i8.28722. Online publication date: 20-Feb-2024.
      • (2024) A Survey of Knowledge Graph Reasoning on Graph Types: Static, Dynamic, and Multi-Modal. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(12), 9456-9478. DOI: 10.1109/TPAMI.2024.3417451. Online publication date: Dec-2024.
      • (2024) From Wide to Deep: Dimension Lifting Network for Parameter-Efficient Knowledge Graph Embedding. IEEE Transactions on Knowledge and Data Engineering 36(12), 8341-8348. DOI: 10.1109/TKDE.2024.3437479. Online publication date: Dec-2024.
      • (2024) Overview of knowledge reasoning for knowledge graph. Neurocomputing 585(C). DOI: 10.1016/j.neucom.2024.127571. Online publication date: 7-Jun-2024.
      • (2024) Hierarchical knowledge graph relationship prediction leverage of axiomatic fuzzy set graph structure. Expert Systems with Applications 251(C). DOI: 10.1016/j.eswa.2024.124090. Online publication date: 24-Jul-2024.
      • (2024) VEML: an easy but effective framework for fusing text and structure knowledge on sparse knowledge graph completion. Data Mining and Knowledge Discovery 38(2), 343-371. DOI: 10.1007/s10618-023-01001-y. Online publication date: 6-Feb-2024.
      • (2023) Neuro-Logic Learning for Relation Reasoning over Event Knowledge Graph. 2023 IEEE International Conference on Intelligence and Security Informatics (ISI), 1-6. DOI: 10.1109/ISI58743.2023.10297212. Online publication date: 2-Oct-2023.
      • (2023) Compression Models via Meta-Learning and Structured Distillation for Named Entity Recognition. 2023 International Conference on Asian Language Processing (IALP), 90-94. DOI: 10.1109/IALP61005.2023.10336991. Online publication date: 18-Nov-2023.
