KGEL: A novel end-to-end embedding learning framework for knowledge graph completion

https://doi.org/10.1016/j.eswa.2020.114164

Highlights

  • Embedding models perform the link prediction task on the basis of individual triples.

  • Gathering neighborhood information leads to richer node embeddings.

  • A GCN-based model is used to produce two embedding vectors for each node.

  • A novel tensor factorization model predicts missing entities to derive new triples.

  • The results show that richer node embeddings lead to significant performance gains.

Abstract

Knowledge graphs (KGs) have recently become increasingly popular due to a broad range of essential applications in various downstream tasks, including intelligent search, personalized recommendations, and intelligent financial data analytics. During the automated construction of a KG, knowledge facts are automatically extracted from multiple knowledge sources in the form of triples, and these observed triples are used to derive new unobserved triples for KG completion (also known as link prediction). State-of-the-art link prediction methods are primarily KG embedding models, among which tensor factorization models have recently drawn much attention due to their scalability and expressive feature embeddings, and hence perform well for link prediction. However, these embedding models consider each KG triple individually and fail to capture the useful information present in the neighborhood of a node. To this end, we propose a novel end-to-end KG embedding learning framework that consists of an encoder, a dual weighted graph convolutional network, and a decoder, a novel fully expressive tensor factorization model. The proposed encoder extends the weighted graph convolutional network to generate two rich, high-quality embedding vectors for each node by aggregating information from its neighboring nodes. The proposed decoder takes a flexible and powerful tensor representation form of the Tensor Train decomposition that exploits the two representations of each node in its embedding space to accurately model the KG triples. We also derive a bound on the size of the embeddings for full expressivity and show that our proposed tensor factorization model is fully expressive. Additionally, we show how our tensor factorization model relates to previous tensor factorization models. The experimental results demonstrate the effectiveness of the proposed framework, which consistently achieves performance gains over several previous models on recent standard link prediction datasets.

Introduction

Expert and intelligent systems perform decision making on the basis of knowledge from multiple sources. An effective knowledge representation is important to integrate and organize knowledge in a form that can easily be explored and interpreted by these intelligent systems (Martinez et al., 2018, Nickel et al., 2015). Knowledge graphs (KGs) provide a semantically structured graphical representation of knowledge facts, where the factual information is represented as triples of the form (e_s, r, e_o), with e_s and e_o the subject and object entities, represented by nodes, and r the relationship between the entities, represented by an edge between their nodes. KGs provide an effective way to extract and integrate structural information about entities from diverse knowledge sources at large scale, link these entities together into flexible ontologies that can easily be extended with new information, and effectively illustrate how the entities are semantically interconnected, so that intelligent systems can explore and interpret the knowledge for sufficient inference (Chen et al., 2020, Zhao et al., 2017). Various companies have created KGs to improve the performance of intelligent systems, including search engines, conversational agents, and data analytics systems. For instance, eBay’s conversational agent (Pittman, Srivastava, Hewavitharana, Kale, & Mansour, 2017) uses a KG to encode product details and customer shopping behavior in order to find relevant products based on customer queries. Bloomberg (Edgar, 2019) uses a KG to encode tweets and news reports, empowering its intelligent data analytics system to detect emerging events that may affect stock values (Hogan et al., 2020).

KGs can also be integrated with various expert and intelligent systems to provide new insights for decision making. For example, a relational-data-based intelligent financial fraud detection system may not identify a potentially fraudulent customer from features alone; a KG-based system, however, can flag the same customer (entity) as fraudulent if it is connected to another fraudulent customer (entity) through a relation such as a shared phone number, email address, or IP address, which suggests that both customers are the same person. Similarly, KGs can be integrated into intelligent automatic driving systems to find similarity between locations (entities) by using distance (relation) information to guide a vehicle to a potential target location. Moreover, KGs can be used in intelligent investment advisers to identify similarity between the features of new and existing customers (entities) connected by some relation, such as financial status, in order to recommend the same investment plan to the new customer. Furthermore, in medical expert systems, KGs can use a relation such as high temperature to connect all diseases involving high temperature, and consequently advise avoiding medicines that are forbidden at high temperature.

Several previous studies (Chen et al., 2018, Kertkeidkachorn and Ichise, 2018, Malik et al., 2020, Zhao et al., 2017) have attempted automated construction of KGs from various information sources. These approaches mainly perform automated extraction of triples; however, they do not focus on deriving new unobserved triples from the extracted ones, as done by Alobaidi, Malik, and Hussain (2018) for automated ontology generation. More precisely, it is important not only to automatically extract entities and relations and link them into a semantically structured representation, but also to discover existing entities that have relations with other existing entities, deriving new unobserved triples that improve and maximize knowledge, with the ultimate goal of KG completion providing new insights for interpretation and sufficient inference. Thus, KG completion methods can be considered an integral part of automated KG construction approaches and an important prerequisite for the development of more intelligent systems. KG completion methods predict new unobserved triples based on the existing ones. In particular, link prediction is a KG completion task where the object (e_s, r, ?) or subject entity (?, r, e_o) is predicted to derive new triples.
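For illustration, link prediction can be viewed as ranking candidate entities for a query; the following toy sketch (all names and the stand-in scoring function are our illustrative assumptions, not the proposed model) makes this concrete:

```python
# A toy illustration of link prediction as candidate ranking.
# The stand-in plausibility function below is an illustrative assumption;
# in a trained system it would be the embedding model's scoring function.

def score(s: str, r: str, o: str) -> float:
    """Stand-in plausibility score; a trained embedding model goes here."""
    known = {("Lion", "eat", "Wildcat")}
    return 1.0 if (s, r, o) in known else 0.0

def predict_object(s, r, candidates):
    """Answer the query (s, r, ?) by ranking every candidate object."""
    return sorted(candidates, key=lambda o: score(s, r, o), reverse=True)

print(predict_object("Lion", "eat", ["Rabbit", "Wildcat"]))
# ['Wildcat', 'Rabbit'] -> Wildcat is the top-ranked object
```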

For link prediction, several KG embedding learning approaches have been proposed and have gained significant attention. These KG embedding models can be broadly classified into translational distance models (Bordes et al., 2013, Ji et al., 2015, Lin et al., 2015, Sun et al., 2018), tensor factorization models (Nickel et al., 2016, Trouillon et al., 2016, Yang et al., 2015), and deep learning models (Dettmers et al., 2018, Dong et al., 2014, Nguyen et al., 2018). Given a KG, these models first embed its components, i.e., the entities and relations, into a continuous vector space, and then define a scoring function on each triple to measure its plausibility of being a valid fact (Wang, Mao, Wang, & Guo, 2017).

Among KG embedding models, tensor factorization models have gained the most popularity due to their simplicity, scalability, expressivity, and significant performance. The key idea in tensor factorization approaches is to represent the triples (e_s, r, e_o) as a 3rd-order binary tensor Y, and to treat link prediction as a binary tensor completion task based on a low-rank factorization (the embeddings), inferring a predicted tensor P that approximates the given tensor Y (Trouillon et al., 2016). To do so, several tensor factorization approaches decompose the given tensor Y into a product of embedding matrices, resulting in r-dimensional vector representations of entities and vector/matrix representations of relations. For a given triple (e_s, r, e_o), the subject entity e_s and object entity e_o are linked together through the relation r, and the score of the triple is recovered through a bilinear product between the embeddings of e_s, r, and e_o (Trouillon et al., 2016).
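As a concrete instance of this recipe, the following minimal NumPy sketch recovers a triple score with a DistMult-style diagonal relation parameterization (random parameters and illustrative names; one member of the family, not the model proposed here):

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities, num_relations, dim = 100, 10, 16

# Low-rank factors: one r-dimensional vector per entity and per relation
# (a DistMult-style diagonal relation matrix).
E = rng.normal(size=(num_entities, dim))   # entity embeddings
R = rng.normal(size=(num_relations, dim))  # relation embeddings

def score(s: int, r: int, o: int) -> float:
    """Bilinear score <e_s, w_r, e_o> = sum_k E[s,k] * R[r,k] * E[o,k]."""
    return float(np.sum(E[s] * R[r] * E[o]))

# Each cell of the predicted tensor P is one such score, e.g.:
print(score(0, 3, 42))
```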

Tensor factorization models perform the link prediction task solely on the basis of individual triples. They ignore the connectivity structure of the knowledge graph, and thus fail to embed useful information from the local neighborhood into the node embedding space. Weighted Graph Convolutional Networks (WGCNs) (Liu et al., 2019, Shang et al., 2019) have been effective tools for aggregating local information from the neighborhood of each node to generate richer node embeddings. WGCNs learn to assign different weights to the edges connecting neighboring nodes and simply aggregate all neighboring nodes to update the embedding of a given node. However, KGs are among the data structures whose nodes often exhibit different behaviors simultaneously, depending on whether they appear as a subject or an object entity in different triples. For example, (Lion, eat, Wildcat) and (Wildcat, eat, Rabbit) are two triples from the animal food chain. Here, the node Wildcat has two neighbors (Lion and Rabbit). In the former triple, Wildcat appears as an object entity and exhibits a different behavior (prey) than the behavior (hunter) it exhibits as a subject entity in the latter triple. Despite being effective tools for gathering neighborhood information, WGCNs may fail to differentiate between the behaviors of a node as the object or the subject of a specific triple.
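A simplified picture of the weighted aggregation step is sketched below (shapes and names are our assumptions; published WGCN variants differ in details, e.g. by sharing one learned weight per relation type):

```python
import numpy as np

def wgcn_layer(H, adj, edge_weight, W):
    """One simplified weighted-GCN layer (illustrative only).

    H:           (n, d_in)    current node embeddings
    adj:         (n, n)       binary adjacency matrix of the KG
    edge_weight: (n, n)       learned edge weights (often shared per relation type)
    W:           (d_in, d_out) layer projection matrix
    """
    agg = (adj * edge_weight) @ H   # weighted sum over each node's neighbors
    return np.tanh((H + agg) @ W)   # add self-connection, project, apply nonlinearity

# Tiny usage example with random parameters:
rng = np.random.default_rng(0)
n, d_in, d_out = 5, 4, 3
H = rng.normal(size=(n, d_in))
adj = (rng.random((n, n)) < 0.4).astype(float)
alpha = rng.normal(size=(n, n))
W = rng.normal(size=(d_in, d_out))
print(wgcn_layer(H, adj, alpha, W).shape)  # (5, 3)
```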

Motivated by the aforementioned observations, we propose KGEL, a novel end-to-end KG embedding learning framework. KGEL consists of an encoder, a dual weighted graph convolutional network (which we call DualWGCN), and a decoder, a novel fully expressive tensor factorization model (which we call TensorT) based on the Tensor Train decomposition (Oseledets, 2011). The proposed DualWGCN contains a clustering layer followed by a graph embedding layer. The clustering layer splits the neighborhood of each node into two clusters, based on whether the given node appears as a subject or an object with respect to its neighboring nodes in the corresponding triples. These two clusters are assigned to two similar but independent WGCN models in the graph embedding layer, which produce two embedding vectors for each node: one capturing its behavior as a subject node and the other capturing its behavior as an object node.
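The split performed by the clustering layer can be pictured as follows (a sketch based on the description above; the function and variable names are ours):

```python
from collections import defaultdict

def split_neighborhoods(triples):
    """Split each node's neighborhood by the role the node plays.

    Returns two maps: the neighbors a node sees as a subject, and the
    neighbors it sees as an object, each kept with the linking relation.
    """
    as_subject = defaultdict(set)
    as_object = defaultdict(set)
    for s, r, o in triples:
        as_subject[s].add((r, o))  # node s acts as the subject here
        as_object[o].add((r, s))   # node o acts as the object here
    return as_subject, as_object

# Food-chain example from the text:
triples = [("Lion", "eat", "Wildcat"), ("Wildcat", "eat", "Rabbit")]
subj, obj = split_neighborhoods(triples)
print(subj["Wildcat"])  # {('eat', 'Rabbit')} -> hunter behavior
print(obj["Wildcat"])   # {('eat', 'Lion')}  -> prey behavior
```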

As a decoder, our proposed TensorT defines a scoring function that distinguishes between the subject and object embeddings of an entity, taking advantage of the two embedding vectors produced by the encoder for each entity. This contrasts with many tensor factorization models, which make no distinction between the embeddings of an entity (i.e., they use a single embedding vector per entity) whether it appears as the object or the subject of a specific triple. Since a KG contains various types of relation patterns (e.g., symmetric and asymmetric), a representative embedding model should be fully expressive in order to model all the relation patterns. We therefore derive a dimensionality bound that guarantees the full expressiveness of TensorT.
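The structural idea, distinct subject-role and object-role embeddings entering a Tensor Train style chain of core products, can be sketched as follows (the plain TT parameterization, shapes, and names are our assumptions, not the exact TensorT scoring function presented later):

```python
import numpy as np

rng = np.random.default_rng(1)
n_ent, n_rel, r1, r2 = 100, 10, 8, 8

# TT cores of a 3rd-order tensor: Y(s, r, o) ~ G1[s] @ G2[r] @ G3[o]
G1 = rng.normal(size=(n_ent, 1, r1))   # subject-role entity embeddings
G2 = rng.normal(size=(n_rel, r1, r2))  # relation core matrices
G3 = rng.normal(size=(n_ent, r2, 1))   # object-role entity embeddings

def tt_score(s: int, r: int, o: int) -> float:
    """Score a triple as the chain product of its three TT cores."""
    return (G1[s] @ G2[r] @ G3[o]).item()  # (1,r1)@(r1,r2)@(r2,1) -> scalar

# The same entity gets different parameters in G1 and G3, matching the
# encoder's two embedding vectors per node.
print(tt_score(0, 3, 42))
```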

Our specific contributions are summarized as follows:

  • A novel KG embedding learning framework consisting of (1) an encoder, DualWGCN, which extends WGCN to produce two high-quality embedding vectors for each entity, one as the object and one as the subject of its triples, and (2) a decoder, TensorT, a novel fully expressive tensor factorization model that exploits those embedding vectors to predict missing entities.

  • A bound on the size of the embeddings for full expressiveness, together with a proof that TensorT is fully expressive.

  • A theoretical analysis showing how two existing representative models, DistMult and ComplEx, can be considered special cases of our TensorT.

  • An evaluation of the proposed framework on three recent standard link prediction datasets, demonstrating relative improvements over several state-of-the-art link prediction models.

The rest of the paper is organized as follows. Related work and the proposed framework are discussed in Sections 2 and 3, respectively, followed by theoretical insights, experimental results, and concluding remarks in Sections 4, 5, and 6, respectively.

Section snippets

Related work

In this section, we discuss several existing KG embedding models for link prediction. The entity embeddings, relation embeddings, and scoring functions of these models are summarized in Table 1.

Methodology

We first provide brief background on link prediction and on WGCN, and then present the proposed framework.

Bound on embedding dimensionality for full expressiveness

An embedding model is considered fully expressive if it can represent any ground truth, i.e., there exists an assignment of values to the relation and entity embeddings that accurately separates the true triples from the false ones. The following Proposition 1 establishes a bound on the size of the relation and entity embeddings for the full expressivity of TensorT.
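Restating the prose definition in symbols (the scoring function φ and threshold τ are our notation, not the paper's):

```latex
% Full expressiveness, restating the definition above in symbols
% (scoring function \phi and threshold \tau are our notation):
\exists\,\Theta \;\; \forall (e_s, r, e_o) \in \mathcal{E} \times \mathcal{R} \times \mathcal{E}:
\quad \phi_{\Theta}(e_s, r, e_o) > \tau \iff (e_s, r, e_o) \text{ is a true triple}.
```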

Proposition 1

For any ground truth over a set of entities E and relations R, there exists a TensorT model with embeddings of dimensionality |E|·|R| that accurately separates the true triples from the false ones.

Dataset description

Enterprise KGs such as those of Airbnb (Chang, 2018), eBay (Pittman et al., 2017), and Bloomberg (Edgar, 2019) are typically internal to a company, applied to commercial use cases, and not publicly available. However, some open-source KGs such as Freebase (Bollacker, Evans, Paritosh, Sturge, & Taylor, 2008) and WordNet (Miller, 1995) are publicly accessible. These open-source KGs are both extremely large and highly incomplete by nature. For instance, Freebase contains around 1.2 billion triples and

Conclusion

In this paper, we proposed an end-to-end KG embedding learning framework that consists of an encoder, a DualWGCN, and a decoder, TensorT, a novel fully expressive tensor factorization model, along with an efficient linear-time formulation of TensorT. We presented a theoretical analysis showing how TensorT subsumes previous tensor factorization models for link prediction. We also derived a bound on the size of the embeddings for full expressivity and showed that TensorT is a fully expressive model.

CRediT authorship contribution statement

Adnan Zeb: Conceptualization, Investigation, Methodology, Writing - original draft. Anwar Ul Haq: Writing - review & editing, Validation. Defu Zhang: Formal analysis, Writing - review & editing, Supervision. Junde Chen: Resources, Visualization. Zhiguo Gong: Formal analysis, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work is partially supported by the National Natural Science Foundation of China (Grant no. 61672439).

References (52)

  • Dettmers, T., Minervini, P., Stenetorp, P., & Riedel, S. (2018). Convolutional 2D knowledge graph embeddings. In...
  • Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., et al. (2014). Knowledge vault: A web-scale...
  • Edgar, M. (2019). Understanding news using the Bloomberg knowledge graph.
  • Glorot, X., et al. (2013). A semantic matching energy function for learning with multi-relational data. Machine Learning.
  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE...
  • Hogan, A., et al. (2020). Knowledge graphs.
  • Ioffe, S., et al. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift.
  • Ji, G., He, S., Xu, L., Liu, K., & Zhao, J. (2015). Knowledge graph embedding via dynamic mapping matrix. In...
  • Kertkeidkachorn, N., et al. (2018). An automatic knowledge graph creation framework from natural language text. IEICE Transactions on Information and Systems.
  • Kingma, D. P., et al. (2015). Adam: A method for stochastic optimization.
  • Kipf, T., et al. (2017). Semi-supervised classification with graph convolutional networks.
  • Lin, Y., Liu, Z., Sun, M., Liu, Y., & Zhu, X. (2015). Learning entity and relation embeddings for knowledge graph...
  • Lin, X. V., Socher, R., & Xiong, C. (2018). Multi-hop knowledge graph reasoning with reward shaping. In Proceedings of...
  • Liu, S., et al. (2019). High-order weighted graph convolutional networks.
  • Liu, H., et al. Analogical inference for multi-relational embeddings.
  • Martinez, J. L., et al. (2018). OpenIE-based approach for knowledge graph construction from text. Expert Systems with Applications.