ADRL: An attention-based deep reinforcement learning framework for knowledge graph reasoning

https://doi.org/10.1016/j.knosys.2020.105910

Abstract

Knowledge graph reasoning is one of the key technologies for knowledge graph construction, and it plays an important part in application scenarios such as vertical search and intelligent question answering. It is intended to infer the desired entity from the entities and relations that already exist in the knowledge graph. Most current reasoning methods, such as embedding-based methods, globally embed all entities and relations and then use vector similarity to infer relations between entities or whether given triples are true. However, in real application scenarios, we require a clear and interpretable target entity as the output answer. In this paper, we propose a novel attention-based deep reinforcement learning framework (ADRL) for learning multi-hop relational paths, which improves the efficiency, generalization capacity, and interpretability of conventional approaches through the structured perception of deep learning and the relational reasoning of reinforcement learning. We define the entire process of reasoning as a Markov decision process. First, we employ a CNN to map the knowledge graph to a low-dimensional space and a message-passing mechanism to sense neighbor entities at each level, and then employ an LSTM to memorize and generate the sequence of historical trajectories that forms the policy and value functions. We design a relational module that includes a self-attention mechanism that can infer and share the weights of neighborhood entity vectors and relation vectors. Finally, we employ the actor–critic algorithm to optimize the entire framework. Experiments confirm the effectiveness and efficiency of our method on several benchmark data sets.
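To make the Markov decision process formulation concrete, the following minimal sketch casts multi-hop reasoning over a toy triple store as an MDP: states carry the query and the current entity, actions are the outgoing edges, and the reward is 1 only when the agent reaches the target entity. The entity names and the greedy action choice are illustrative assumptions, not the paper's implementation (which learns the policy with CNN/LSTM encoders and self-attention).

```python
from dataclasses import dataclass

# Toy knowledge graph as a set of (head, relation, tail) triples.
# Entity and relation names are illustrative, not from the paper's data sets.
TRIPLES = {
    ("Paris", "capital_of", "France"),
    ("France", "part_of", "Europe"),
    ("Paris", "located_in", "France"),
}

def actions(entity):
    """Available actions = outgoing edges of the current entity."""
    return sorted((r, t) for h, r, t in TRIPLES if h == entity)

@dataclass
class State:
    query: tuple          # (source entity, query relation, target entity)
    current: str          # entity the agent currently stands on
    history: tuple = ()   # sequence of (relation, entity) hops taken so far

def step(state, action):
    """One MDP transition: walk an edge; reward 1 iff the target is reached."""
    relation, tail = action
    next_state = State(state.query, tail, state.history + (action,))
    reward = 1.0 if tail == state.query[2] else 0.0
    return next_state, reward

# Greedy 2-hop rollout answering the query ("Paris", "in_continent", "Europe").
state = State(("Paris", "in_continent", "Europe"), "Paris")
total = 0.0
for _ in range(2):
    acts = actions(state.current)
    if not acts:
        break
    # A trained policy would score actions; here we simply take the first
    # edge whose tail entity has not been visited yet.
    visited = {e for _, e in state.history} | {state.query[0]}
    choice = next((a for a in acts if a[1] not in visited), acts[0])
    state, reward = step(state, choice)
    total += reward
```

Because the agent only ever inspects the outgoing edges of its current entity, each decision touches a small neighborhood rather than the whole graph.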

Introduction

Relational reasoning [1] is a basic problem in statistical relational learning research, and it is also a hot issue in the field of knowledge graphs. The knowledge graph is a knowledge base [2], [3] with a graph-like structure in which knowledge is expressed and stored in the form of “entity-relation-entity” triples [4]. The object of this paper is the relational reasoning problem on knowledge graphs; the task objective of relational reasoning is to use the knowledge already in the knowledge graph to infer possible relations between entities with statistical machine learning or deep learning methods [5], [6], [7].

Relational reasoning is also known as entity prediction, relation prediction, or knowledge graph completion [8]; it aims to infer the missing element of a given triple from the existing elements to form a complete, valid triple [9], [10]. The mainstream reasoning models for current knowledge bases include the latent factor model [11], [12] and the random walk model [13]. The former maps entities and relations into a low-dimensional real vector space and implements reasoning through vector similarity calculations [14]. The latter is built on first-order predicate logic [15] for relational reasoning between entities, and its algorithmic complexity is reduced through randomization. In comparison, the former has higher computational complexity due to the need for large-scale matrix operations [16], while the latter's random sampling makes it difficult to fully utilize the structural information already in the knowledge base, resulting in a lower recall rate [17].

In recent years, deep learning (DL) [18], [19] and reinforcement learning (RL) [20], as principal research hotspots in the field of machine learning, have achieved remarkable success in many areas of artificial intelligence [21], [22]. The DL approach focuses on the perception and expression of things: its basic idea is to combine low-level features through a multi-layered network structure and nonlinear transformations to form abstract, easily distinguishable high-level representations and discover the distributed feature representation of the data [23], [24]. The RL method, by contrast, focuses on learning strategies for solving problems; the basic idea of RL [25] is to learn the optimal strategy for accomplishing a goal by maximizing the cumulative reward the agent obtains from the environment. In increasingly complex real-world tasks, DL is needed to automatically learn abstract representations of large-scale input data, so that RL can optimize the problem-solving strategy on the basis of this representation [26]. Combining the advantages of DL and RL has formed a new research hotspot in the field of artificial intelligence: deep reinforcement learning (DRL) [27], [28], [29]. Inference and completion are equivalent to querying and finding answers, which requires access to many nodes and edges in the knowledge graph, so the process of finding answers can be modeled as a serialized decision problem, which can naturally be solved with deep reinforcement learning [30], [31].
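The cumulative reward mentioned above is usually a discounted sum over the trajectory; a small sketch (plain Python, purely illustrative) of the quantity an RL agent maximizes:

```python
def discounted_return(rewards, gamma=0.9):
    """G = r_0 + gamma*r_1 + gamma^2*r_2 + ...  (computed back to front)."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# A sparse trajectory: reward only on the final step.
g = discounted_return([0.0, 0.0, 1.0], gamma=0.9)  # 0.9**2 = 0.81
```

The discount factor gamma < 1 makes earlier success worth more, which is what pushes a path-finding agent toward short relational paths.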

In this work, we propose a novel relational reinforcement learning framework that maps relational states, actions, and policies to a low-dimensional space using deep learning architectures [32], so that the generalization capacity of deep learning combines with the inferential ability of reinforcement learning to solve search and Q&A problems [33]. Our approach advocates using a learnable, reusable, entity- and relation-centric function to implicitly reason about relations, and we employ a deep RL agent with architectural inductive biases that may be better suited to learning (and computing) relations.

Our framework has several desirable features. First, it is efficient for path-based computation, because the expensive operation of ranking all entities in the knowledge graph is avoided by searching small neighborhoods around the current entity. Second, we employ self-attention [34], [35] to infer the weights of entities and relations, shared with the RL agent; ADRL requires no pre-training, supervision, or fine-tuning, but instead trains on the knowledge graph through search via reinforcement learning. Third, we use the actor–critic algorithm [36] in reinforcement learning to effectively increase the efficiency of the model; it does not require complete trajectories and is model-free. The actor generates trajectories, and the critic learns the policy and value functions to evaluate the actor's actions, so that the model can learn better policies. Finally, the paths and trajectories our agents explore automatically become resources for the model's continued exploration.
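As an illustration of the second feature, the sketch below computes scaled dot-product self-attention weights over a set of neighbor entity/relation embeddings with NumPy. The dimensions and random projection matrices are assumptions for demonstration only; in the paper's module these weights are shared with the RL policy rather than used in isolation.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a set of vectors.

    X: (n, d) stacked embeddings of a node's neighborhood (entities
    and relations). Returns the weighted summaries and the (n, n)
    attention-weight matrix, whose rows each sum to 1."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                 # 5 neighbor embeddings, dim 8
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
```

Each row of `weights` is a distribution over the neighborhood, which is what lets the agent rank candidate edges instead of ranking every entity in the graph.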

Our contributions in this paper are as follows:

  • We propose a new network architecture based on deep reinforcement learning for knowledge graph reasoning, which can improve the efficiency and interpretability of traditional methods through structured perception and relational reasoning.

  • We design a relational module that can be viewed as a universal plug-in for an inference framework, where the introduced self-attention mechanism iteratively infers the relations between entities to guide a model-free policy, which is more conducive to inferring relational paths than previous work.

  • We employ the actor–critic architecture to effectively address reward sparsity: the reward depends on the value function, which is trained and optimized together with the policy. In addition, we adopt a distributed set-up based on the asynchronous advantage actor–critic (A3C) algorithm, whose multi-threaded agents improve efficiency and make the framework better suited to large-scale knowledge graphs.
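The advantage-based update underlying the actor–critic family (including A3C's per-thread workers) can be sketched on a toy chain environment. This tabular one-step version is an illustrative assumption, not the paper's distributed architecture:

```python
import numpy as np

# Tabular one-step advantage actor-critic on a 4-state chain.
# Moving right from state 2 into state 3 yields reward 1; all other
# transitions yield 0. Purely illustrative, not the paper's A3C set-up.
n_states, n_actions = 4, 2               # actions: 0 = left, 1 = right
theta = np.zeros((n_states, n_actions))  # policy logits (actor)
V = np.zeros(n_states)                   # state values (critic)
gamma, alpha = 0.95, 0.1
rng = np.random.default_rng(0)

def policy(s):
    """Softmax over the logits of state s."""
    e = np.exp(theta[s] - theta[s].max())
    return e / e.sum()

for _ in range(2000):                    # episodes
    s = 0
    for _ in range(10):                  # step limit per episode
        p = policy(s)
        a = rng.choice(n_actions, p=p)
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0
        done = s2 == n_states - 1
        # The TD error doubles as the advantage estimate: the critic
        # moves toward the bootstrapped target, and the actor is pushed
        # along grad log pi(a|s) scaled by that advantage.
        target = r + (0.0 if done else gamma * V[s2])
        adv = target - V[s]
        V[s] += alpha * adv
        grad = -p
        grad[a] += 1.0                   # d log softmax / d theta[s, a]
        theta[s] += alpha * adv * grad
        s = s2
        if done:
            break
```

After training, a greedy rollout from state 0 follows the learned policy straight to the rewarded state; because every update bootstraps on the critic's value estimate, learning proceeds without waiting for complete trajectories, which is the property that makes the scheme amenable to asynchronous multi-threaded agents.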

Section snippets

Related work

In the early stages of statistical relational learning research, relational reasoning was mainly based on first-order predicate logic rules [37]. The representative work is the Markov Logic Network proposed by [32], which combines Markov random fields with first-order predicate logic rules and builds a logical network to model and reason about the relations between entities. Its principal advantage is the high accuracy of relational reasoning. The disadvantage is the fact that it

Methodology

In this section, we describe in detail our attention-based deep reinforcement learning framework (ADRL) for multi-hop relational reasoning. The core idea of relational reinforcement learning [58], [59], [60] is to integrate reinforcement learning with inductive logic programming or relational learning by representing actions, states, and policies in a first-order language. The relational language facilitates the use of background knowledge, which can be provided by rules and logical

Experiments

In this section, we first describe the data sets we use in our experiments and the parameter settings for model training. We then design a series of experiments to demonstrate the validity and efficiency of our model, which outperforms traditional embedding-based methods, the PRA method, and recent reinforcement-learning-based methods.

Conclusion and future work

In this paper, we propose a novel attention-based deep reinforcement learning framework that incorporates a message-passing mechanism and inductive biases for relational reasoning on knowledge graphs. We combine a CNN and an LSTM to encode the knowledge graph and remember histories, and introduce a relational module containing self-attention to infer the weights of the entities the agent walks on. This facilitates better decision making and can be shared with the actor–critic algorithm, which is a

CRediT authorship contribution statement

Yongsheng Hao: Writing - review & editing. Jie Cao: Funding acquisition. All other work was done by Qi Wang.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The work was partly supported by the National Social Science Foundation of China (No. 16ZDA054), and the National Science Foundation of China (No. U1736105).

Qi Wang received the B.S. and M.Eng. degree in Software engineering from Jilin University (Changchun city) and Central South University (Changsha city) in 2012 and 2016, China, respectively. He is currently pursuing the PH.D. degree in Software engineering at the school of computer science, Fudan University, Shanghai, China.

His current research interests include knowledge graph, deep learning, and reinforcement learning.

References (71)

  • Nickel, Maximilian, et al., A three-way model for collective learning on multi-relational data

  • Santoro, Adam, A simple neural network module for relational reasoning

  • Glorot, Xavier, A semantic matching energy function for learning with multi-relational data (2013)

  • Gardner, Matt, et al., Improving learning and inference in a large knowledge-base using latent syntactic cues, in: ...

  • Socher, Richard, Reasoning with neural tensor networks for knowledge base completion

  • Bordes, Antoine, Translating embeddings for modeling multi-relational data

  • Lao, Ni, et al., Random walk inference and learning in a large scale knowledge base

  • Yang, Fan, et al., Differentiable learning of logical rules for knowledge base reasoning

  • He, He, Learning symmetric collaborative dialogue agents with dynamic knowledge graph embeddings (2017)

  • Hinton, Geoffrey, Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Process. Mag. (2012)

  • LeCun, Yann, et al., Deep learning, Nature (2015)

  • Zhang, Chiyuan, A study on overfitting in deep reinforcement learning (2018)

  • Silver, David, Mastering the game of go with deep neural networks and tree search, Nature (2016)

  • Silver, David, Mastering the game of go without human knowledge, Nature (2017)

  • Glorot, Xavier, Bengio, Yoshua, Understanding the difficulty of training deep feedforward neural networks, in: ...

  • Džeroski, Sašo, et al., Relational reinforcement learning, Mach. Learn. (2001)

  • Sutton, Richard S., et al., Reinforcement Learning: An Introduction (2018)

  • Mnih, Volodymyr, Human-level control through deep reinforcement learning, Nature (2015)

  • Driessens, Kurt, Ramon, Jan, Relational instance based regression for relational reinforcement learning, in: Proceedings...

  • Van Otterlo, Martijn, Relational representations in reinforcement learning: Review and open problems

  • Esposito, Massimo, Hybrid query expansion using lexical resources and word embeddings for sentence retrieval in question answering, Inform. Sci. (2019)

  • Veličković, Petar, Graph attention networks (2017)

    Yongsheng Hao received his MS Degree of Engineering from Qingdao University in 2008. Now, he is a senior engineer of Network Center, Nanjing University of Information Science & Technology. His current research interests include distributed and parallel computing, mobile computing, Grid computing, web Service, particle swarm optimization algorithm and genetic algorithm. He has published more than 30 papers in international conferences and journals.

    Jie Cao received the Ph.D. degree from Southeast University, Nanjing, China, in 2005. He was an Associate Professor, from 1999 to 2006. From 2006 to 2009, he was a Postdoctoral Fellow of the Academy of Mathematics and Systems Science, Chinese Academy of Science. From 2009 to 2019, he was a Professor with the School of Management and Economics, Nanjing University of Information Science and Technology. Since May 2019, he has been the Vice President of the Xuzhou University of Technology. His research interests include system engineering, and management science and technology.
