
Knowledge-Based Systems

Volume 183, 1 November 2019, 104870

Multi-view multitask learning for knowledge base relation detection

https://doi.org/10.1016/j.knosys.2019.104870

Abstract

Relation detection is a key component of knowledge base question answering (KBQA). Existing methods mainly focus on learning the semantic relevance between the question and the candidate relation, which is challenging due to lexical variation (i.e., lexical gaps), especially for relations with few annotated samples. In this paper, we propose to estimate the semantic relevance from both the traditional question–relation view and a novel question–question view, which leverages the similarities among questions corresponding to the same relation. The question–question view facilitates the learning of few-shot relations, and it supports accessing the annotated samples during inference, thus allowing new annotations to take effect on-the-fly. Moreover, a multitask learning framework is devised to jointly optimize the models of the different views. Experimental results on WebQSP and a Chinese KB relation detection dataset demonstrate the effectiveness and generalization ability of the proposal.

Introduction

Knowledge base question answering (KBQA), which answers natural language questions by querying a knowledge base (KB), is an important task in natural language processing. Typical KBQA consists of two subtasks, namely topic entity detection and relation detection. The former focuses on recognizing the topic KB entity being queried, and relation detection then aims to determine the relation chain from the topic entity to the answer entity. In this paper, we focus on the relation detection subtask, which remains a bottleneck of KBQA [1] and has attracted substantial research attention.

One of the main challenges for relation detection lies in the lexical chasm between the relation expressions in user queries and those in the KB. Representation learning methods [1], [2], [3], [4] usually encode the relation and the question into vectors and estimate their semantic relevance via the cosine similarity between the two vectors. Recent works [5], [6] resort to performing comparison at a fine-grained level and aggregating the comparison results to estimate the relevance. These methods can alleviate the problem caused by lexical gaps to some extent. However, they may fail to capture the relevance when quite different relation expressions are adopted in the KB and in questions, especially for relations with few annotated samples. For example, without enough labeled samples, it is hard for the model to identify that the question “who are the senators of New Jersey” in Fig. 1 refers to the KB relation path “representatives..office_holder” (the symbol .. denotes the presence of a relation chain). In fact, this is also challenging for people without background knowledge.
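As background for the discussion below, here is a minimal PyTorch sketch of such an encode-and-compare scorer: question and relation tokens are embedded, run through a shared BiLSTM, max-pooled, and compared with cosine similarity, trained with a hinge ranking loss over a golden and a sampled negative relation. The framework choice, architecture, and hyper-parameters are illustrative assumptions, not the exact models of [1], [2], [3], [4].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionRelationScorer(nn.Module):
    """Encode the question and the candidate relation with a shared BiLSTM
    and score their relevance with cosine similarity."""

    def __init__(self, vocab_size: int, dim: int = 300, hidden: int = 200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.encoder = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        states, _ = self.encoder(self.embed(token_ids))  # (B, T, 2 * hidden)
        return states.max(dim=1).values                  # max-pool over time

    def forward(self, question_ids: torch.Tensor, relation_ids: torch.Tensor) -> torch.Tensor:
        return F.cosine_similarity(self.encode(question_ids),
                                   self.encode(relation_ids), dim=-1)

def ranking_loss(pos_score: torch.Tensor, neg_score: torch.Tensor, margin: float = 0.5):
    """Hinge loss pushing the golden relation above a sampled negative relation."""
    return F.relu(margin - pos_score + neg_score).mean()
```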

Besides the KB relation itself, the questions querying this relation (defined as its parallel questions) can also indicate its semantic meaning, and these questions are likely to have similar expressions, as illustrated in Fig. 1. In this case, it is much easier to capture the similarity between questions. Motivated by this observation, this paper devises a multi-view architecture incorporating both the traditional question–relation view and an implicit question–question view, which considers the relevance between the question and the parallel questions of the relation. The question–relation view corresponds to the vanilla relation detection model. Furthermore, if we assume that the parallel questions of a relation are paraphrases of each other once their topic entities are replaced with a special token such as “e”, then the question–question view can be formulated as a paraphrase detection problem.
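The sketch below illustrates this formulation under stated assumptions: the topic-entity mention is replaced with a placeholder (written here as "<e>", an illustrative choice), and a candidate relation is scored by its best-matching parallel question. Both function names and the `paraphrase_model` interface (any sentence-pair similarity function) are hypothetical, and max-aggregation is only one possible choice.

```python
def mask_topic_entity(question: str, entity_mention: str, placeholder: str = "<e>") -> str:
    """Replace the topic-entity mention so that parallel questions of the same
    relation become near-paraphrases of each other."""
    return question.replace(entity_mention, placeholder)

def question_question_score(question: str, parallel_questions, paraphrase_model) -> float:
    """Score a candidate relation by its best-matching parallel question.
    `paraphrase_model(a, b)` is assumed to return a similarity score."""
    scores = [paraphrase_model(question, pq) for pq in parallel_questions]
    return max(scores) if scores else float("-inf")

# Example: "who are the senators of New Jersey" -> "who are the senators of <e>"
masked = mask_topic_entity("who are the senators of New Jersey", "New Jersey")
```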

When large lexical gaps exist between the question and the relation, their semantic relevance may be better reflected by the synonymous relationships between the question and the parallel questions, so the overall relation detection performance can be improved by utilizing the two complementary views. Meanwhile, the Wh-words in parallel questions, such as “where” and “who”, often indicate the tail entity’s type, providing extra cues for determining the relation. Besides the improvement in accuracy, another advantage of this paradigm is that incrementally annotated questions can take effect on-the-fly without re-training the model, which may be useful in personalized KBQA where newly annotated samples can be mined from user feedback.
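The on-the-fly property follows from the fact that parallel questions are only consulted at inference time. The sketch below shows the idea with a hypothetical per-relation pool of masked parallel questions: adding a newly annotated question changes the ranking immediately, with no retraining. The names `parallel_pool`, `add_annotation`, `rank_relations`, and `paraphrase_model` are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical pool of masked parallel questions per KB relation.
parallel_pool = defaultdict(list)
parallel_pool["representatives..office_holder"].append("who are the senators of <e>")

def add_annotation(relation: str, masked_question: str) -> None:
    """Newly annotated questions (e.g. mined from user feedback) take effect
    immediately because they are only looked up at inference time."""
    parallel_pool[relation].append(masked_question)

def rank_relations(question: str, candidate_relations, paraphrase_model):
    """Rank candidate relations by their best-matching parallel question."""
    def score(relation):
        pool = parallel_pool[relation]
        return max((paraphrase_model(question, pq) for pq in pool), default=float("-inf"))
    return sorted(candidate_relations, key=score, reverse=True)
```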

Both vanilla relation detection and paraphrase detection can be posed as similarity measurement between two sequences. To make the two views reinforce each other, we further propose a multitask learning framework that jointly optimizes the relation detection model and the paraphrase detection model. A sketch of the multitask learning framework is given in Fig. 2. The question representation block and the sequence relevance feature extraction block are shared between the two models, reducing the total number of parameters and the risk of over-fitting. The multi-view matching approach proposed by Yu et al. [6] enriches the relation representation with additional information extracted from the KB and the relation. Specifically, the type of the tail entity and the similarity between the entity name and its mention in the question are taken as inputs of the relation detection model. Our proposal differs from their approach in both the information utilized and the modeling method. We are the first to utilize the similarities among parallel questions. The parallel questions can fully represent the relation, so the question–question view can rank the relations on its own, whereas the views studied in [6] can only serve as supplementary information. Meanwhile, the nature of the exploited information determines the differences in the modeling methods. Our multi-view multitask learning framework is model-agnostic, so the model with multiple inputs designed by Yu et al. [6] could be adopted as the vanilla relation detection model RD(r|q) of the framework.
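A minimal sketch of the shared-parameter setup follows, assuming a BiLSTM question representation block and a feed-forward relevance feature extractor shared by the two views, with a joint hinge loss balanced by a task weight. Layer sizes, pooling, and the weight alpha are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiViewMultitaskModel(nn.Module):
    """The question representation block (a BiLSTM here) and the sequence
    relevance feature extraction block (a small feed-forward net here) are
    shared; only the second input differs between the two views."""

    def __init__(self, vocab_size: int, dim: int = 300, hidden: int = 200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.shared_encoder = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.relevance = nn.Sequential(          # shared relevance feature extractor
            nn.Linear(4 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def encode(self, ids: torch.Tensor) -> torch.Tensor:
        states, _ = self.shared_encoder(self.embed(ids))
        return states.max(dim=1).values

    def score(self, a_ids: torch.Tensor, b_ids: torch.Tensor) -> torch.Tensor:
        pair = torch.cat([self.encode(a_ids), self.encode(b_ids)], dim=-1)
        return self.relevance(pair).squeeze(-1)

    def forward(self, q_ids, rel_ids, para_ids):
        # Relation detection view RD(r|q) and paraphrase detection view share all parameters.
        return self.score(q_ids, rel_ids), self.score(q_ids, para_ids)

def joint_loss(rd_pos, rd_neg, pd_pos, pd_neg, margin: float = 0.5, alpha: float = 1.0):
    """Sum of hinge losses for the two views; alpha balances the tasks (an assumed knob)."""
    rd = F.relu(margin - rd_pos + rd_neg).mean()
    pd = F.relu(margin - pd_pos + pd_neg).mean()
    return rd + alpha * pd
```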

Experimental results on two relation detection datasets demonstrate that incorporating the question–question view improves the overall performance. Additionally, both the relation detection model and the paraphrase detection model are enhanced under the multitask learning framework, confirming the effectiveness of the proposed approach.

To sum up, the main contributions are as follows:

  • We propose to model the relation detection task from two complementary views, namely the classical question–relation view and a novel question–question view, to better estimate the semantic relevance between questions and relations;

  • A multitask learning framework is further devised, empowering the two views to reinforce each other, and meanwhile reducing the risk of over-fitting;

  • Our approach outperforms the state-of-the-art methods on the two relation detection datasets, and additional experiments also show that the multi-view multitask learning mechanism can be generalized to different model paradigms.

Section snippets

Relation detection for knowledge base question answering

Relation detection models for KBQA should be capable of handling relations that are absent from the training data in order to support open-domain KBQA, which distinguishes the task from relation classification [7]. Traditional KBQA relation detection models rely on hand-crafted features [8], [9]. Recent years have witnessed the study of deep learning-based methods. Generally, the question and the candidate relation are first encoded into semantic vectors, and their relevance is estimated via cosine similarity between the two vectors.

Approach

In this section, we describe the proposed multi-view multitask learning method for knowledge base relation detection. Notations used in this section are listed in Table 1 prior to the model description.

WebQSP.

WebQSP [3] contains real-world questions whose answers are entities from Freebase. Yu et al. [1] build a knowledge base relation detection benchmark based on the WebQSP dataset. The relation chains (length ≤ 2) connected to the topic entity are taken as candidate relations of the question, and relation chains pointing to the answer entity are labeled as golden relations.
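As a small illustration of this labeling rule, a candidate relation chain counts as positive exactly when it points from the topic entity to an answer entity. The helper `label_candidates` and the relations "capital" and "time_zones" below are made up for the example; only "representatives..office_holder" comes from the text.

```python
def label_candidates(candidate_chains, gold_chains):
    """Label each candidate relation chain of a question: positive iff it
    leads from the topic entity to an answer entity."""
    gold = set(gold_chains)
    return [(chain, chain in gold) for chain in candidate_chains]

# Illustrative example for "who are the senators of new jersey"
candidates = ["representatives..office_holder", "capital", "time_zones"]
print(label_candidates(candidates, ["representatives..office_holder"]))
# [('representatives..office_holder', True), ('capital', False), ('time_zones', False)]
```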

CKBRD.

Following a similar method, a Chinese knowledge base relation detection dataset (CKBRD) is constructed.

Conclusion and future work

Traditional KB relation detection models rank the candidate relations according to their semantic similarities with the question. This paper presents a multi-view mechanism that captures the semantic similarities from both the traditional relation view and a novel question view, which leverages the relevance among questions corresponding to the same relation. Besides, jointly optimizing the two views under the multitask learning paradigm further enhances the performance. Experimental results on WebQSP and the Chinese relation detection dataset CKBRD demonstrate the effectiveness of the proposed approach.

References (47)

  • H. Zhang, G. Xu, X. Liang, T. Huang, K. Fu, An Attention-Based Word-Level Interaction Model: Relation Detection for...
  • Y. Yu, K. Saidul Hasan, M. Yu, W. Zhang, Z. Wang, Knowledge Base Relation Detection via Multi-View Matching, ArXiv...
  • D. Zeng et al., Relation classification via convolutional deep neural network
  • X. Yao et al., Information extraction over structured data: Question answering with Freebase
  • H. Bast et al., More accurate question answering on Freebase
  • Z. Dai et al., CFO: Conditional focused neural question answering with large-scale knowledge bases
  • A. Bordes et al., Translating embeddings for modeling multi-relational data
  • R. West et al., Knowledge base completion via search-based question answering
  • L. He et al., Knowledge base completion by variational Bayesian neural tensor decomposition, Cognit. Comput. (2018)
  • W. Yin et al., Convolutional neural network for paraphrase identification
  • S.R. Bowman et al., A large annotated corpus for learning natural language inference
  • B. Wang et al., Inner attention based recurrent neural networks for answer selection
  • J.M. Bromley et al., Signature verification using a Siamese time delay neural network
