
Neural Networks

Volume 143, November 2021, Pages 261-270

HiAM: A Hierarchical Attention based Model for knowledge graph multi-hop reasoning

https://doi.org/10.1016/j.neunet.2021.06.008

Abstract

Learning to reason in large-scale knowledge graphs has attracted much attention from research communities recently. This paper targets the practical task of multi-hop reasoning in knowledge graphs, which can be applied in various downstream tasks such as question answering and recommender systems. A key challenge in multi-hop reasoning is to synthesize structural information (e.g., paths) in knowledge graphs to perform deeper reasoning. Existing methods usually focus on the connection paths between each entity pair. However, these methods ignore the predecessor paths that precede connection paths and regard the entities and relations within each path as equally important. Based on our observations, predecessor paths can provide more accurate semantic representations, and the entities and relations in a single path contribute differently to finding the right answers. To this end, we propose a novel model, HiAM (Hierarchical Attention based Model), for knowledge graph multi-hop reasoning. HiAM makes use of predecessor paths to provide more accurate semantics for entities and explores the effects of features of different granularities. Firstly, we extract the predecessor paths of head entities and the connection paths between each entity pair. Then, a hierarchical attention mechanism is designed to capture features of different granularities, including entity/relation-level and path-level features. Finally, the multi-granularity features are fused to predict the right answers. We go one step further and select the most significant path as the explanation for each predicted answer. Comprehensive experimental results demonstrate that our method achieves competitive performance compared with the baselines on three benchmark datasets.

Introduction

Reasoning over incomplete knowledge graphs (KGs) has been a research hotspot in recent years and supports a wide range of applications such as question answering (Cui et al., 2017, Deng et al., 2019) and recommender systems (Fan et al., 2019, Wang, Wang et al., 2019). Knowledge graph reasoning aims to infer missing facts from the existing knowledge in KGs. Our research concentrates on multi-hop reasoning (also known as multi-hop relation reasoning). Taking Fig. 1 as an example, we can infer the missing fact (VillersCotterets, LocatedIn, France) via the existing path VillersCotterets → LocatedIn → Aisne → LocatedIn → France.

Current reasoning methods can be roughly classified into two categories: path-based and embedding-based. Different from traditional rule-based reasoning methods (Kok and Domingos, 2007, Lenat et al., 1990, Lisi, 2007, Nilsson, 1991), current methods mainly utilize path and triple features to infer new facts in KGs. Path-based reasoning methods are developed to capture path structure information, usually relying on the random walk inference technique (Lao, Mitchell, & Cohen, 2011) to extract and select paths. The Path Ranking Algorithm (PRA) (Lao & Cohen, 2010) is one of the most popular reasoning strategies, which encodes KGs as multi-relation graphs. In path-based reasoning methods, the paths between each entity pair are utilized to determine whether a specific relation holds between the two entities (Lao and Cohen, 2010, Lin, Liu, Luan et al., 2015, Wang et al., 2016); these methods have strong practicality but poor generalization performance. Embedding-based reasoning methods learn low-dimensional vectors for both entities and relations in KGs by projecting them into continuous vector spaces. The prediction problem is then transformed into a simple vector operation, as in TransE (Bordes, Usunier, Garcia-Duran, Weston, & Yakhnenko, 2013), TransH (Wang, Zhang, Feng, & Chen, 2014), and DistMult (Yang, Yih, He, Gao, & Deng, 2015). Although embedding-based reasoning methods have been developed more comprehensively, challenges remain: because these methods do not incorporate semantic and path (rule) information in depth, their performance on complicated reasoning tasks is limited. In addition, some methods use neural networks to capture the structural and semantic information in KGs (Das et al., 2017, Neelakantan et al., 2015, Xiong et al., 2017).

The aforementioned reasoning methods predict answers using the connection paths between entity pairs, which are defined as the paths from the head entity to the tail entity, such as the paths AlexandreDumas → WorkedIn → Paris → LocatedIn → France and AlexandreDumas → BornIn → VillersCotterets → LocatedIn → Aisne → LocatedIn → France between the entity pair (Alexandre Dumas, France) shown in Fig. 1. We can infer the fact (AlexandreDumas, Nationality, France) by utilizing these connection paths. However, when predicting the relation between the entity pair Alexandre Dumas and La Reine Margot, we cannot determine whether the relation Write holds, since Alexandre Dumas can refer to either Alexandre Dumas pere or Alexandre Dumas fils: if it refers to Alexandre Dumas pere, the relation Write holds; if it refers to Alexandre Dumas fils, it does not. Predecessor paths are defined as preconditions of connection paths; for example, TheThreeMusketeers → WrittenBy → AlexandreDumas is a predecessor path of the connection paths between the entity pair Alexandre Dumas and France. Predecessor paths can provide more accurate semantics for entities and help predict the right answer. Using the predecessor path TheThreeMusketeers → WrittenBy → AlexandreDumas, we can infer that Alexandre Dumas refers to Alexandre Dumas pere rather than Alexandre Dumas fils, and thus the relation Write holds for the entity pair (Alexandre Dumas, La Reine Margot).
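
To make the two kinds of paths concrete, the sketch below enumerates connection paths between an entity pair and one-hop predecessor paths ending at the head entity over a toy triple list that mirrors Fig. 1. The helper names, the depth-first search, and the hop limit are illustrative assumptions, not the paper's actual extraction procedure.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples, mirroring Fig. 1.
triples = [
    ("TheThreeMusketeers", "WrittenBy", "AlexandreDumas"),
    ("AlexandreDumas", "WorkedIn", "Paris"),
    ("Paris", "LocatedIn", "France"),
    ("AlexandreDumas", "BornIn", "VillersCotterets"),
    ("VillersCotterets", "LocatedIn", "Aisne"),
    ("Aisne", "LocatedIn", "France"),
]

# Adjacency list: entity -> list of outgoing (relation, tail) edges.
graph = defaultdict(list)
for h, r, t in triples:
    graph[h].append((r, t))

def connection_paths(head, tail, max_hops=3):
    """Enumerate paths head -> ... -> tail with at most max_hops relations (DFS)."""
    paths, stack = [], [(head, [head])]
    while stack:
        node, path = stack.pop()
        if node == tail and len(path) > 1:
            paths.append(path)
            continue
        if (len(path) - 1) // 2 >= max_hops:
            continue
        for rel, nxt in graph[node]:
            if nxt not in path[::2]:  # do not revisit entities
                stack.append((nxt, path + [rel, nxt]))
    return paths

def predecessor_paths(head):
    """One-hop predecessor paths ending at `head`,
    e.g. TheThreeMusketeers -> WrittenBy -> AlexandreDumas."""
    return [[h, r, t] for h, r, t in triples if t == head]

print(connection_paths("AlexandreDumas", "France"))
print(predecessor_paths("AlexandreDumas"))
```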

Another problem with existing methods is that they mainly concentrate on the different effects of paths but rarely consider the different effects of entities and relations within each path. We observe that different entities, relations, and paths make different contributions to reasoning. Given the query "What is the nationality of Alexandre Dumas?", there are three related paths, as shown in Fig. 1. All of these paths contribute to the final reasoning result, but the path TheThreeMusketeers → WrittenBy → AlexandreDumas → BornIn → VillersCotterets → LocatedIn → Aisne → LocatedIn → France has a greater impact than the others. Within this path, the entities Alexandre Dumas and VillersCotterets and the relation LocatedIn have greater impacts than the other elements.

To solve the above problems, we propose a novel model HiAM (Hierarchical Attention based Model) for knowledge graph reasoning. More specifically, we first extract the predecessor paths and the connection paths. Then a hierarchical attention mechanism is applied in our model to utilize the features of different granularities, including entity/relation-level and path-level features. The entity/relation-level attention is devised to capture entity and relation features, and explore their respective impact on reasoning. The path-level attention is used to capture path features, which takes the output of the entity/relation-level attention as input. Finally, multi-granularity features are leveraged to predict the probability of correct answers. We go one step further to select the most significant path as the explanation for predicted answers, like the explanation “Alexandre Dumas was born in VillersCotterets, which is located in Aisne, France.” for the answer France.
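
The following PyTorch sketch, written under our own assumptions about module names, dimensions, and the scoring function, illustrates the two attention levels just described: element-level attention weights the entities and relations inside each path, path-level attention weights the pooled path vectors, and the fused context is scored against the query relation. The returned path weights can also be inspected to pick the most significant path as an explanation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalPathAttention(nn.Module):
    """Two-level attention sketch: element-level attention over entities/relations
    within each path, followed by path-level attention over whole paths."""

    def __init__(self, dim):
        super().__init__()
        self.element_att = nn.Linear(dim, 1)   # scores each entity/relation in a path
        self.path_att = nn.Linear(dim, 1)      # scores each pooled path vector
        self.score = nn.Bilinear(dim, dim, 1)  # fuses the path context with the query relation

    def forward(self, path_elems, query_rel):
        # path_elems: (num_paths, path_len, dim) embeddings of entities/relations
        # query_rel:  (dim,) embedding of the relation being predicted

        # Entity/relation-level attention: weight elements inside each path.
        elem_w = F.softmax(self.element_att(path_elems), dim=1)      # (P, L, 1)
        path_vecs = (elem_w * path_elems).sum(dim=1)                 # (P, dim)

        # Path-level attention: weight whole paths.
        path_w = F.softmax(self.path_att(path_vecs), dim=0)          # (P, 1)
        context = (path_w * path_vecs).sum(dim=0)                    # (dim,)

        # Probability that the candidate answer is correct for this query;
        # path_w can be inspected to explain the prediction.
        logit = self.score(context.unsqueeze(0), query_rel.unsqueeze(0))
        return torch.sigmoid(logit), path_w.squeeze(-1)

# Example: 3 candidate paths, each with 5 elements, embedding size 64 (toy numbers).
model = HierarchicalPathAttention(dim=64)
prob, path_weights = model(torch.randn(3, 5, 64), torch.randn(64))
```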

The main contributions of this paper can be summarized as follows:

  • Predecessor paths are incorporated into our method to provide more accurate semantics for entities and enhance the path representations.

  • The hierarchical attention mechanism is employed in our method to capture multi-granularity features, including entity/relation-level features and path-level features. Moreover, explainability for reasoning results can be provided via the hierarchical attention mechanism.

  • Experimental results show that our method achieves competitive performance compared with the baselines on three benchmark datasets.

The rest of this paper is organized as follows. In Section 2, we briefly introduce related work of embedding-based and path-based reasoning methods. Then we describe our proposed method HiAM in Section 3. Experimental results and analyses are presented in Section 4. Finally, we conclude our work and plan future work in Section 5.


Related work

The existing research on knowledge graph reasoning can be roughly divided into embedding-based and path-based methods.

Embedding-based reasoning transforms prediction into simple vector operations by learning low-dimensional vector representations of entities and relations in KGs. TransE (Bordes et al., 2013) learns the representations of entities and relations via additive (translational) functions. TransH (Wang et al., 2014) and TransR (CtransR) (Lin, Liu, Sun, Liu and Zhu, 2015) improve upon TransE by projecting entities onto relation-specific hyperplanes or into relation-specific spaces, respectively, before applying the translation.
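
For reference, the translational intuition behind TransE can be written in a few lines; the L1 norm, margin, and corruption-based ranking loss below are the usual defaults rather than values taken from this paper.

```python
import torch

def transe_score(h, r, t, p=1):
    """TransE plausibility: a triple (h, r, t) is considered likely if h + r ≈ t,
    so a smaller translation error ||h + r - t|| yields a higher score."""
    return -torch.norm(h + r - t, p=p, dim=-1)

def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin-based ranking loss: positive triples should outscore corrupted ones."""
    return torch.clamp(margin - pos_score + neg_score, min=0).mean()

# Toy usage with random 64-dimensional embeddings.
h, r, t = torch.randn(3, 64)
print(transe_score(h, r, t))
```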

Methodology

In this section, we first briefly give a formal definition of our task. Then, we elaborate on our proposed method HiAM, introducing its path extraction module and hierarchical attention network module. The hierarchical attention network consists of entity/relation-level attention and path-level attention. In addition, the model complexity is analyzed to illustrate the efficiency of the proposed HiAM. Lastly, we describe the training and optimization of our method.
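
To give a flavor of the training and optimization step, the toy loop below optimizes a stand-in reasoner with Adam (Kingma et al., 2014) and a binary cross-entropy objective over candidate answers. The stand-in model, batch construction, and loss choice are assumptions made for illustration, not HiAM's actual training setup.

```python
import torch
import torch.nn as nn

# Minimal stand-in for the hierarchical attention network: it maps fused
# path features plus a query-relation embedding to a correctness logit.
class TinyReasoner(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, path_feat, query_rel):
        return self.mlp(torch.cat([path_feat, query_rel], dim=-1)).squeeze(-1)

model = TinyReasoner(dim=64)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam optimizer (Kingma et al., 2014)
loss_fn = nn.BCEWithLogitsLoss()  # binary label: is the candidate tail the right answer?

for step in range(100):
    # Random tensors stand in for extracted multi-granularity path features.
    path_feat = torch.randn(32, 64)
    query_rel = torch.randn(32, 64)
    labels = torch.randint(0, 2, (32,)).float()

    logits = model(path_feat, query_rel)
    loss = loss_fn(logits, labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```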

Experiments

In this section, we evaluate the performance of our proposed HiAM on three benchmark datasets. In the following subsections, we first briefly introduce the experimental setup. Then, performance comparisons and analysis are presented to show the effectiveness of our method. Finally, a case study is presented to illustrate HiAM's reasoning process.

Conclusions and future work

In this paper, we introduce a novel hierarchical attention based model for multi-hop reasoning on large KGs. Compared with previous work, our proposed model makes use of predecessor paths to provide more accurate semantics for entities and explores the effects of features of different granularities on reasoning. The predecessor paths of head entities and the connection paths between each entity pair are combined as the input of our proposed model. A hierarchical attention mechanism then captures entity/relation-level and path-level features, whose fused representation is used to predict answers and to select the most significant path as an explanation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research is supported in part by the Beijing Municipal Science and Technology Project under Grant Z191100007119008.

References

  • Nilsson, N. J. (1991). Logic and artificial intelligence. Artificial Intelligence.
  • Balazevic, I., et al. Hypernetwork knowledge graph embeddings.
  • Bollacker, K., Evans, C., Paritosh, P., Sturge, T., & Taylor, J. (2008). Freebase: A collaboratively created graph...
  • Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling...
  • Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka Jr., E. R., & Mitchell, T. M. (2010). Toward an architecture for...
  • Chen, W., Xiong, W., Yan, X., & Wang, W. Y. (2018). Variational knowledge graph reasoning. In Proceedings of the 2018...
  • Cui, W., Xiao, Y., Wang, H., Song, Y., Hwang, S., & Wang, W. (2017). KBQA: Learning question answering over QA corpora...
  • Das, R., et al. Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning.
  • Das, R., Neelakantan, A., Belanger, D., & McCallum, A. (2017). Chains of reasoning over entities, relations, and text...
  • Deng, Y., Xie, Y., Li, Y., Yang, M., Du, N., & Fan, W., et al. (2019). Multi-task learning with multi-view attention...
  • Dettmers, T., Minervini, P., Stenetorp, P., & Riedel, S. (2018). Convolutional 2D knowledge graph embeddings. In Proc....
  • Fan, S., Zhu, J., Han, X., Shi, C., Hu, L., & Ma, B., et al. (2019). Metapath-guided heterogeneous graph neural network...
  • Kingma, D. P., et al. (2014). Adam: A method for stochastic optimization. Computer Science.
  • Kok, S., et al. Statistical predicate invention.
  • Lao, N., et al. (2010). Relational retrieval using a combination of path-constrained random walks. Machine Learning.
  • Lao, N., Mitchell, T. M., & Cohen, W. W. (2011). Random walk inference and learning in a large scale knowledge base. In...
  • Lenat, D. B., et al. (1990). CYC: Toward programs with common sense. Communications of the ACM.
  • Lin, Y., Liu, Z., Luan, H., Sun, M., Rao, S., & Liu, S. (2015). Modeling relation paths for representation learning of...
  • Lin, Y., Liu, Z., Sun, M., Liu, Y., & Zhu, X. (2015). Learning entity and relation embeddings for knowledge graph...
  • Lisi, F. A. (2007). Reasoning with OWL-DL in inductive logic programming. In C. Golbreich, A. Kalyanpur and B. Parsia...