Path-based reasoning approach for knowledge graph completion using CNN-BiLSTM with attention mechanism

https://doi.org/10.1016/j.eswa.2019.112960

Highlights

  • A coupled CNN and BiLSTM model accurately encodes paths for knowledge graph completion.

  • Combining the embeddings of the paths between two entities captures the semantic relation between them.

  • Multistep reasoning efficiently predicts missing links between two entities.

  • Attention-based CNN-BiLSTM outperforms recent state-of-the-art path-reasoning methods.

Abstract

Knowledge graphs are valuable resources for building intelligent systems such as question answering or recommendation systems. However, most knowledge graphs are impaired by missing relationships between entities. Embedding methods that translate entities and relations into a low-dimensional space achieve great results, but they focus only on the direct relations between entities and neglect the presence of path relations in graphs. In contrast, path-based embedding methods consider only a single path when making inferences and rely on simple recurrent neural networks, even though more effective neural models for processing sequence data are available. We propose a new approach for knowledge graph completion that combines bidirectional long short-term memory (BiLSTM) and convolutional neural network modules with an attention mechanism. Given a candidate relation and two entities, we encode the paths that connect the entities into a low-dimensional space using a convolutional operation followed by BiLSTM. Then, an attention layer is applied to capture the semantic correlation between a candidate relation and each path between the two entities and to attentively extract reasoning evidence from the representation of multiple paths to predict whether the entities should be connected by the candidate relation. We extend our model to perform multistep reasoning over path representations in an embedding space. A recurrent neural network is designed to repeatedly interact with an attention module to derive logical inference from the representation of multiple paths. We perform link prediction tasks on several knowledge graphs and show that our method achieves better performance than recent state-of-the-art path-reasoning methods.

Introduction

Knowledge graphs (KGs), such as Freebase, WordNet, or NELL, are valuable resources for building intelligent systems such as question answering or recommendation systems. These KGs contain millions of facts about real-world entities and relations in the form of triples, e.g., (Bill Gates, founded, Microsoft). However, a large number of relations (triples) between entities are missing from these KGs. To effectively use KGs for other applications, one must perform a KG completion (KGC) task and infer the missing links or triples. The basic idea of a KGC task is to automatically infer missing triples by utilizing existing triples. In recent years, embedding methods that translate entities and relations into a low-dimensional space have achieved great results on KGC tasks. However, most embedding methods consider only the direct relations between entities and overlook the presence of paths. Previously, path ranking algorithms (PRAs), such as those proposed by Lao, Mitchell, and Cohen (2011) and Gardner and Mitchell (2015), have shown that relation paths, i.e., the sequences of relation types connecting two entities, can be effectively used for KGC. Such methods perform random walks over a graph and construct a feature matrix by enumerating the paths between all entity pairs given a candidate relation. Then, a binary classifier, such as logistic regression or a decision tree, is trained on the feature matrix to infer missing links. In recent years, path-based reasoning methods (Das, Dhuliawala, Zaheer, Vilnis, Durugkar, Krishnamurthy, et al., 2018, Das, Neelakantan, Belanger, McCallum, 2017, Nickel, Tresp, Kriegel, 2011, Xiaotian, Quan, Baoyuan, Yongqin, Peng, Bin, 2017) have successfully applied recurrent neural networks (RNNs) to KGC tasks by embedding reasoning paths into a low-dimensional space and have shown significant improvements over PRA methods.
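As a small illustration of the path enumeration that PRA-style methods rely on, the sketch below performs a breadth-first search over a toy graph and collects the relation-type sequences that connect two entities. All entity and relation names here are illustrative and not drawn from any benchmark KG:

```python
from collections import deque

# Toy knowledge graph: adjacency list mapping an entity to (relation, target) edges.
graph = {
    "BillGates": [("founded", "Microsoft"), ("bornIn", "Seattle")],
    "Microsoft": [("headquarteredIn", "Redmond")],
    "Redmond":   [("locatedIn", "Washington")],
    "Seattle":   [("locatedIn", "Washington")],
}

def relation_paths(graph, source, target, max_len=3):
    """Enumerate relation-type sequences connecting source to target (BFS)."""
    paths, queue = [], deque([(source, [])])
    while queue:
        node, rels = queue.popleft()
        if node == target and rels:
            paths.append(tuple(rels))
            continue
        if len(rels) < max_len:
            for rel, nxt in graph.get(node, []):
                queue.append((nxt, rels + [rel]))
    return paths

print(relation_paths(graph, "BillGates", "Washington"))
# → [('bornIn', 'locatedIn'), ('founded', 'headquarteredIn', 'locatedIn')]
```

Each returned sequence is one PRA-style path feature for the entity pair; a classifier would then be trained over counts or probabilities of such paths.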
The idea behind these path-reasoning approaches is that the semantics of a relation between two entities can be represented by the semantics of the multiple paths that connect them. Therefore, a missing relation between two entities can be inferred by learning from the paths that connect them. However, these reasoning methods train a simple RNN, whereas more capable models for sequence data processing are available and exhibit better performance than plain RNNs. Moreover, most of these methods use max-pooling or mean operations to combine multiple paths and neglect the fact that each path provides different reasoning evidence. In fact, an individual path, such as (s, spouse, e), (e, bornIn, t), frequently does not provide any indication of a semantic relationship between entities s and t.

In this paper, we propose a new attention-based approach for KGC that couples a convolutional neural network (CNN) with a bidirectional long short-term memory (BiLSTM) module. First, given a candidate relation and two entities, our method encodes the multiple reasoning paths between the entities into low-dimensional embeddings using the CNN followed by the BiLSTM module. Second, we assume that not all paths between two entities contribute equally to inferring the missing relation between them. To this end, an attention mechanism (Bahdanau, Cho, Bengio, 2015, Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, et al., 2017) is applied to capture the semantic correlation between a candidate relation and each path between the two entities and to generate a single vector representation for all paths between them. Paths of varying lengths that connect the entities are thereby encoded into a fixed-length real-valued vector. Finally, the summation of the relation embedding and the vector representation of the paths is passed through a fully connected layer to predict whether the two entities should be connected by the candidate relation. The principle behind our method is that the CNN extracts local features in a path, and the BiLSTM network uses the ordering of these local features to learn the entity and relation orderings of each path. Finally, the attention layer extracts reasoning evidence from the paths that are correlated with the candidate relation. The attention mechanism in our model is similar to that of Xiaotian et al. (2017). The main difference is that instead of computing the dot product of the target relation and the path vectors, we apply an additive attention function implemented as a feedforward network, which remains stable without extra scaling. Dot-product attention is faster and more space-efficient but in some cases requires an additional scaling factor to compute well-behaved attention weights, which was not implemented in the previous study.
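To make the additive attention step concrete, the following minimal sketch (plain Python with illustrative dimensions; `w_q`, `w_p`, and `v` stand in for the learned projection matrices and scoring vector) scores each path vector against the query relation with a small feedforward computation, normalizes the scores with a softmax, and returns the weighted combination of path vectors:

```python
import math

def additive_attention(query, paths, w_q, w_p, v):
    """Additive (Bahdanau-style) attention: score_i = v · tanh(Wq·q + Wp·p_i).
    Vectors/matrices are plain Python lists; dimensions are illustrative."""
    def matvec(mat, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in mat]

    def score(p):
        hidden = [math.tanh(a + b) for a, b in zip(matvec(w_q, query), matvec(w_p, p))]
        return sum(vi * hi for vi, hi in zip(v, hidden))

    scores = [score(p) for p in paths]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]      # numerically stable softmax
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted sum of path vectors -> one fixed-length representation of all paths.
    combined = [sum(w * p[i] for w, p in zip(weights, paths))
                for i in range(len(paths[0]))]
    return weights, combined
```

For example, with identity projections, a query aligned with the second path receives the larger attention weight, so the combined vector leans toward that path's embedding.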

Furthermore, when the search space in the path representation is very large, combining all paths in a single pass does not provide sufficient evidence to make an inference about the relationship between entities. Therefore, to narrow down the search in a continuous space, we suggest using multiple steps of reasoning. To address this issue, we extend our model to perform multistep reasoning over the path distribution. We adopt an RNN-like multihop (Sukhbaatar, Szlam, Weston, & Fergus, 2015) reasoning network that enables the model to read the embeddings of the same paths multiple times and update the encoding vector at each step before producing the final output. Through experiments, we demonstrate that multistep reasoning over the path distribution can significantly improve the reasoning performance on KGC tasks. Moreover, our model is trained collectively, end-to-end, with gradient descent for all candidate relations.
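A minimal sketch of such an RNN-like multihop read might look as follows; the dot-product attention and the fixed blending factor `alpha` are simplifications of the learned update described above, not the authors' exact formulation:

```python
import math

def multistep_read(query, paths, steps=3, alpha=0.5):
    """Hypothetical multihop read (cf. memory networks): repeatedly attend over
    the same path vectors and blend the attended summary back into the query."""
    def attend(q, ps):
        scores = [sum(qi * pi for qi, pi in zip(q, p)) for p in ps]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]
        # Attention-weighted summary of the path vectors.
        return [sum(wi * p[i] for wi, p in zip(w, ps)) for i in range(len(q))]

    for _ in range(steps):
        summary = attend(query, paths)
        # Update the query with the evidence gathered at this step.
        query = [alpha * qi + (1 - alpha) * si for qi, si in zip(query, summary)]
    return query
```

Each hop re-reads the same path memory with an updated query, so evidence accumulated at one step can sharpen the attention at the next.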

In the experiments, we perform link prediction tasks on four different KGs, i.e., NELL, Freebase, Kinship, and Countries. We compare our method with the most recent state-of-the-art path-based reasoning methods using various measures. For link prediction, given a test triple, we replace its source or target entity with random entities and rank the original triple against this corrupted candidate set under each method. We further visualize multiple reasoning paths and observe that the paths that connect similar entity pairs are closely clustered together. Empirically, we show that our approach achieves results comparable with previous methods and exhibits better performance in several cases.
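The corruption-based ranking protocol can be sketched as follows (function names are illustrative): the true triple's score is ranked against the scores of its corrupted variants, and MRR averages the reciprocal ranks over all test triples:

```python
def rank_of_true(score_true, scores_corrupted):
    """1-based rank of the true triple's score among the corrupted candidates
    (higher score = better)."""
    return 1 + sum(1 for s in scores_corrupted if s > score_true)

def mean_reciprocal_rank(ranks):
    """MRR over test triples, given each true triple's 1-based rank."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Example: one corrupted candidate outscores the true triple -> rank 2.
print(rank_of_true(0.9, [0.5, 0.95, 0.2]))      # → 2
print(mean_reciprocal_rank([1, 2, 4]))          # → 0.5833...
```

An MRR of 1.0 would mean the true triple is ranked first for every test case; lower values indicate that corrupted candidates often outscore it.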


Related work

This section reviews previous studies on KGC tasks. Previous works are broadly divided into two categories, i.e., path-based reasoning and KG embedding. KG embedding predicts missing links by applying low-dimensional embedding approaches to KGs (Bordes, Usunier, Garcia-Duran, Weston, Yakhnenko, 2013, Nickel, Murphy, Tresp, Gabrilovich, 2015, Nickel, Tresp, Kriegel, 2011, Wang, Mao, Wang, Guo, 2017). The key idea of embedding-based KGC is to represent entities and relations as low-dimensional vectors.

Method

In this section, we present our approach for KGC via link prediction tasks, which aim to predict missing links in a graph. An overview of the approach is shown in Figs. 1 and 2. First, we briefly review the problem of KGC and the PRA and describe how we obtain paths. Then, we introduce the CNN and BiLSTM modules, which embed relational paths into a low-dimensional space and combine those paths using an attention module according to a query relation. Then, we describe the RNN that performs multistep reasoning over the path representations.

Experiments

We evaluate our model on link prediction tasks and report the results on four different KGs. The statistics of the graph datasets are presented in Table 1. The hyperparameters of our model that yield the best performance on the development set are selected via a small grid search. Several measures are adopted to quantitatively evaluate our model, including F1, mean average precision (MAP), and mean reciprocal rank (MRR). MAP is the mean, over all queries, of the average of the precision values at the ranks where relevant items are retrieved.
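As a concrete reference for the MAP measure, the sketch below computes average precision for a single ranked candidate list; MAP is then the mean of these values over all queries:

```python
def average_precision(relevance):
    """AP for one ranked list: mean of precision@k over each rank k at which
    a relevant item appears. `relevance` is 0/1 labels in ranked order."""
    hits, precisions = 0, []
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Relevant items at ranks 1 and 3: AP = (1/1 + 2/3) / 2 = 5/6 ≈ 0.833.
print(average_precision([1, 0, 1]))
```

Note that AP rewards placing relevant triples near the top of the ranking, which is why it complements rank-only measures such as MRR.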

Conclusion

In this paper, we propose a new approach for KGC that combines BiLSTM and CNN modules with an attention mechanism. Given a candidate relation and two entities, we encode the paths that connect the entities into a low-dimensional space using a convolutional operation followed by BiLSTM. Then, an attention layer is applied to combine multiple paths efficiently. We further extend our model to perform multistep reasoning over path representations in an embedding space. Compared to other models, our approach achieves comparable and, in several cases, better performance.

Acknowledgement

This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (2019000067, Semantic Analysis Reasoning Methods for Automatic Completion of Large Scale Knowledge Graph).

CRediT authorship contribution statement

Batselem Jagvaral: Conceptualization, Data curation, Writing - original draft, Writing - review & editing. Wan-Kon Lee: Writing - review & editing. Jae-Seung Roh: Data curation. Min-Sung Kim: Data curation. Young-Tack Park: Conceptualization, Writing - original draft, Writing - review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (30)

  • D. Bahdanau et al.

    Neural machine translation by jointly learning to align and translate

    (2015)
  • A. Bordes et al.

    Translating embeddings for modeling multi-relational data

    Advances in neural information processing systems 26

    (2013)
  • G. Bouchard et al.

    On approximate reasoning capabilities of low-rank vector spaces

    AAAI spring symposia

    (2015)
  • R. Das et al.

    Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning

    International conference on learning representations (ICLR)

    (2018)
  • R. Das et al.

    Chains of reasoning over entities, relations, and text using recurrent neural networks

    Proceedings of the 15th conference of the European chapter of the Association for Computational Linguistics: Volume 1, Long papers

    (2017)
  • T. Dettmers et al.

    Convolutional 2D knowledge graph embeddings

    AAAI

    (2018)
  • Z. Gan et al.

    Multi-step reasoning via recurrent dual attention for visual dialog

    (2019)
  • M. Gardner et al.

    Efficient and expressive knowledge base completion using subgraph feature extraction

    Proceedings of the 2015 conference on empirical methods in natural language processing

    (2015)
  • M. Gardner et al.

    Improving learning and inference in a large knowledge-base using latent syntactic cues

    Proceedings of the 2013 conference on empirical methods in natural language processing

    (2013)
  • S. Hochreiter et al.

    Long short-term memory

    Neural Computation

    (1997)
  • S. Kok et al.

    Statistical predicate invention

    Proceedings of the 24th international conference on machine learning

    (2007)
  • N. Lao et al.

    Relational retrieval using a combination of path-constrained random walks

    Machine Learning

    (2010)
  • N. Lao et al.

    Random walk inference and learning in a large scale knowledge base

    Proceedings of the Conference on Empirical Methods in Natural Language Processing

    (2011)
  • Y. Lin et al.

    Modeling relation paths for representation learning of knowledge bases

    Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

    (2015)
  • L. van der Maaten et al.

    Visualizing data using t-SNE

    Journal of Machine Learning Research

    (2008)