Neural Networks

Volume 139, July 2021, Pages 140-148
Knowledge graph embedding with shared latent semantic units

https://doi.org/10.1016/j.neunet.2021.02.013

Highlights

  • We propose a general technique that considers latent semantic units in KGE.

  • We propose two learning strategies for the proposed technique.

  • Experiments on popular benchmarks demonstrate the effectiveness of our models.

Abstract

Knowledge graph embedding (KGE) aims to project both entities and relations into a continuous low-dimensional space. However, for a given knowledge graph (KG), only a small number of entities and relations occur many times, while the vast majority occur far less frequently. This data sparsity problem has largely been ignored by existing KGE models. To address it, in this paper we propose a general technique that enables knowledge transfer among semantically similar entities or relations. Specifically, we define latent semantic units (LSUs), which are the sub-components of entity and relation embeddings. Semantically similar entities or relations are expected to share the same LSUs, and thus knowledge can be transferred among them. Finally, extensive experiments show that the proposed technique is able to enhance existing KGE models and can provide better representations of KGs.

Introduction

Nowadays, large-scale knowledge graphs (KGs), such as Freebase (Bollacker, Evans, Paritosh, Sturge, & Taylor, 2008), WordNet (Miller, 1995) and Yago (Suchanek, Kasneci, & Weikum, 2007), are extremely useful resources that have changed the paradigm for AI-related applications, such as information retrieval (Dalton, Dietz, & Allan, 2014), question answering (Ferrucci, et al., 2010) and information extraction (Mintz, Bills, Snow, & Jurafsky, 2009). KGs are multi-relational graphs composed of entities as nodes and relations as edges. They represent information about real-world entities and relations as triples of the form (h, r, t), where h and t correspond to the head and tail entities and r denotes the relation between them, e.g., (Donald Trump, presidentOf, USA). Although such triples are effective in representing structured data, their underlying symbolic nature often makes KGs hard to manipulate.

Recently, knowledge graph embedding (KGE), which aims to transform entities and relations into continuous low-dimensional vectors, has attracted massive attention. Such embeddings encode rich information about entities and relations, and can be widely utilized in neural network based models for knowledge completion, fusion and inference (Kazemi and Poole, 2018, Xiong et al., 2017, Zhang et al., 2018). However, most existing KGE models fail to give proper attention to the data sparsity problem of long-tail entities and relations: for a given KG, only a small number of entities and relations occur many times, while the vast majority occur far less frequently. TranSparse (Ji, Liu, He, & Zhao, 2016) and ITransF (Xie, Ma, Dai, & Hovy, 2017) try to address this problem with sparse matrices and shared concept projection matrices, respectively. However, these two models only focus on the data sparsity of relations, paying no attention to entities; moreover, both models lack extensibility. Fig. 1 shows the number of occurrences of entities and relations in the popular KG dataset FB15k-237 (Toutanova & Chen, 2015), which clearly illustrates this problem: the number of occurrences of most entities and relations, which lie in the tail of the curve, is small, and KGE models may therefore fail to obtain high-quality embeddings for them.

In this paper, we propose a general technique for the data sparsity problem. Inspired by the fact that every color can be composed from the three primary colors, we define latent semantic units (LSUs),1 which are the bases of entity and relation embeddings. More specifically, our model regards the embedding of each entity and relation as a combination of several LSUs. Similar entities or relations are expected to share the same LSUs; in this way, knowledge can be transferred among these entities or relations through the shared units, so that less frequent entities and relations are enriched with valuable information from semantically similar but more frequent ones. In addition, an attention mechanism is utilized to find the appropriate combination of LSUs for each entity and relation. Fig. 2 presents a simple example with 5 LSUs in total: entity e1 is made up of the 1st, 3rd and 4th LSUs, and e2 is comprised of the 3rd, 4th and 5th LSUs. The 3rd and 4th LSUs are shared by e1 and e2, which enables knowledge transfer between the two entities. To take full advantage of LSUs, we propose two learning strategies for the LSU-based approach: a dense attention strategy and a reinforcement learning strategy. Moreover, our model does not need additional information such as text or paths.
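To make the idea concrete, the following is a minimal sketch of the dense attention strategy. All names and sizes here (the class LSUEmbedding, the pool size, the dimensions) are our own illustrative assumptions, not the authors' implementation: each embedding is a softmax-weighted combination of vectors drawn from a shared pool of LSUs.

```python
# Minimal sketch of LSU-based embeddings with dense attention
# (illustrative assumptions, not the authors' released code).
import torch
import torch.nn as nn

class LSUEmbedding(nn.Module):
    """Embeds items (entities or relations) as attention-weighted
    combinations of a shared pool of latent semantic units (LSUs)."""

    def __init__(self, num_items: int, num_lsus: int, dim: int):
        super().__init__()
        # Shared pool of LSU vectors: the "bases" of all embeddings.
        self.lsus = nn.Parameter(torch.randn(num_lsus, dim) * 0.1)
        # Per-item attention logits over the LSU pool.
        self.att_logits = nn.Parameter(torch.zeros(num_items, num_lsus))

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Dense attention: each item attends softly to every LSU, so items
        # that put weight on the same units share parameters, which is how
        # knowledge transfers from frequent to long-tail items.
        weights = torch.softmax(self.att_logits[ids], dim=-1)  # (batch, num_lsus)
        return weights @ self.lsus                             # (batch, dim)

# Fig. 2 analogue: a pool of 5 LSUs; e1 and e2 overlap on shared units.
entity_emb = LSUEmbedding(num_items=10, num_lsus=5, dim=8)
e1_e2 = entity_emb(torch.tensor([0, 1]))  # embeddings for e1 and e2
```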

In this paper, we extend three popular KGE models, TransE (Bordes, Usunier, Garcia-Duran, Weston, & Yakhnenko, 2013), DistMult (Yang, Yih, He, Gao, & Deng, 2015) and ConvE (Dettmers, Minervini, Stenetorp, & Riedel, 2018), to learn knowledge representations by leveraging LSU information. The same technique can also be applied to other KGE models. Experiments on benchmark datasets clearly validate the effectiveness of our approach.

In summary, we highlight our key contributions as follows:

  1. We propose a general technique that treats each entity and relation in a KG as a composition of LSUs. Our technique can be utilized by most KGE models; in particular, we extend TransE, DistMult and ConvE in this paper.

  2. To take full advantage of the LSUs, we propose two learning strategies for the proposed technique, without utilizing additional information such as text or paths.

  3. Extensive experiments on popular benchmarks demonstrate the effectiveness of our models.

Section snippets

Preliminaries

In this section, we briefly introduce the three models which we aim to extend in this paper.

TransE (Bordes et al., 2013) is one of the most widely used KGE models; it is motivated by the linear translation phenomenon observed in well-trained word embeddings (Mikolov, Sutskever, Chen, Corrado, & Dean, 2013). TransE assumes $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ when $(h, r, t)$ holds. The score function of TransE is defined as $f(h, r, t) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert_{L_n}$, where $L_n$ can be the $L_1$ or $L_2$ norm, chosen according to the model's performance on the validation set.
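As a quick illustration of the score just defined (a sketch under our own naming, not code from the paper), the distance-based TransE score can be written as:

```python
# Sketch of the TransE score: the L1 or L2 distance between h + r and t.
# Lower scores indicate more plausible triples.
import torch

def transe_score(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor,
                 p: int = 1) -> torch.Tensor:
    # p=1 gives the L1 variant, p=2 the L2 variant; in practice the choice
    # is made by validation performance, as noted above.
    return torch.norm(h + r - t, p=p, dim=-1)
```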

Methodology

In this section, we present our approach, which incorporates LSU information into the KGE task. A KG is viewed as a graph $\mathcal{G} = \{(h, r, t)\} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$, where $\mathcal{E}$ and $\mathcal{R}$ are the entity (node) set and the relation (edge) set, respectively. In this paper, we propose that entities are composed of entity LSUs $U^e = \{u^e_1, u^e_2, u^e_3, \ldots, u^e_{|U^e|}\}$, and relations are comprised of relation LSUs $U^r = \{u^r_1, u^r_2, u^r_3, \ldots, u^r_{|U^r|}\}$. Note that parameters with superscripts $e$ and $r$ are for entities and relations, respectively. In order to make the following
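Concretely, two separate LSU pools are maintained, one for entities and one for relations. The sketch below reuses the illustrative LSUEmbedding class from the Introduction and plugs both pools into the TransE score; the entity and relation counts match FB15k-237, while the pool sizes $|U^e|$ and $|U^r|$ are hypothetical hyperparameters, not values from the paper.

```python
# Separate LSU pools U^e and U^r feeding an LSU-based TransE score.
# Pool sizes below are made-up hyperparameters for illustration.
import torch

entity_lsu = LSUEmbedding(num_items=14541, num_lsus=200, dim=100)  # |E|, |U^e|
relation_lsu = LSUEmbedding(num_items=237, num_lsus=50, dim=100)   # |R|, |U^r|

def lsu_transe_score(h_id: torch.Tensor, r_id: torch.Tensor,
                     t_id: torch.Tensor) -> torch.Tensor:
    h = entity_lsu(h_id)     # head entity assembled from entity LSUs
    r = relation_lsu(r_id)   # relation assembled from relation LSUs
    t = entity_lsu(t_id)     # tail entity assembled from entity LSUs
    return torch.norm(h + r - t, p=1, dim=-1)
```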

Experiments

In this section, we evaluate the proposed technique on link prediction and triple classification tasks.
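For reference, link prediction is commonly evaluated with a ranking protocol: corrupt each test triple, rank all candidate entities by score, and report MRR and Hits@k. The sketch below is our own simplified, unfiltered version of that protocol, not the paper's evaluation code, and shows only the core ranking step for a distance-based scorer.

```python
# Simplified (raw, unfiltered) tail ranking for link prediction.
# score_fn is any triple scorer where lower = more plausible (e.g. TransE).
import torch

def tail_rank(score_fn, h_id: int, r_id: int, t_id: int,
              num_entities: int) -> int:
    candidates = torch.arange(num_entities)
    h = torch.full((num_entities,), h_id, dtype=torch.long)
    r = torch.full((num_entities,), r_id, dtype=torch.long)
    scores = score_fn(h, r, candidates)
    # Rank of the true tail: 1 + number of candidates scored strictly better.
    return int((scores < scores[t_id]).sum().item()) + 1

# MRR is the mean of 1/rank over all test triples; Hits@k is the fraction
# of triples with rank <= k.
```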

Related work

Existing KGE models roughly fall into three categories: (i) translation-based models, which view relations as translations from a head entity to a tail entity (Bordes et al., 2013, Lin et al., 2015b, Wang et al., 2014). TransE (Bordes et al., 2013) is one of the most widely used KGE models; it views a relation as a translation from a head entity to a tail entity in the same low-dimensional space, i.e., $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ when $(h, r, t)$ holds. TransH (Wang et al., 2014) introduces a mechanism of

Conclusion

In this paper, we proposed a general technique for KGE that alleviates the data sparsity problem by leveraging LSUs, and developed two learning strategies to take full advantage of them. Along this line, we extended three popular KGE models, TransE, DistMult and ConvE, by sharing LSUs; notably, the same technique can be applied to extend other state-of-the-art KGE models. Finally, we evaluated our models on popular benchmarks. The results showed that semantically similar entities and

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The research work is supported by the National Key Research and Development Program of China under Grant No. 2017YFB1002104, the National Natural Science Foundation of China under Grant Nos. U1836206, U1811461, 61773361, the Project of Youth Innovation Promotion Association CAS, China under Grant No. 2017146.

References (35)

  • Bollacker, K., Evans, C., Paritosh, P., Sturge, T., & Taylor, J. (2008). Freebase: a collaboratively created graph...
  • Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling...
  • Dalton, J., Dietz, L., & Allan, J. (2014). Entity query feature expansion using knowledge base links.

  • Dettmers, T., Minervini, P., Stenetorp, P., & Riedel, S. (2018). Convolutional 2d knowledge graph embeddings. In...
  • Feng, J., Huang, M., Zhao, L., Yang, Y., & Zhu, X. (2018). Reinforcement learning for relation classification from...
  • Ferrucci, D., et al. (2010). Building Watson: An overview of the DeepQA project. AI Magazine.
  • Ji, G., Liu, K., He, S., & Zhao, J. (2016). Knowledge graph completion with adaptive sparse transfer matrix. In AAAI...
  • Jiang, J. (2009). Multi-task transfer learning for weakly-supervised relation extraction. In ACL (pp....
  • Kazemi, S. M., & Poole, D. (2018). SimplE embedding for link prediction in knowledge graphs. In NIPS (pp....
  • Kingma, D., et al. (2014). Adam: A method for stochastic optimization. Computer Science.
  • Lee, H., Battle, A., Raina, R., & Ng, A. Y. (2007). Efficient sparse coding algorithms. In NIPS (pp....
  • Lin, Y., Liu, Z., Luan, H., Sun, M., Rao, S., & Liu, S. (2015). Modeling relation paths for representation learning of...
  • Lin, Y., Liu, Z., Sun, M., Liu, Y., & Zhu, X. (2015). Learning entity and relation embeddings for knowledge graph...
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and...
  • Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM.
  • Mintz, M., Bills, S., Snow, R., & Jurafsky, D. (2009). Distant supervision for relation extraction without labeled...
  • Nathani, D., Chauhan, J., Sharma, C., & Kaul, M. (2019). Learning attention-based embeddings for relation prediction in...