ABSTRACT
Why-question answering (why-QA) is the task of retrieving answers (or answer passages) to why-questions (e.g., "Why are tsunamis generated?") from a text archive. Several previously proposed why-QA methods improved their performance by automatically recognizing causalities expressed with explicit cues such as "because" in answer passages and using the recognized causalities as a clue for finding proper answers. However, causalities in answer passages may also be expressed implicitly, i.e., without any explicit cue: "An earthquake suddenly displaced sea water and a tsunami was generated." Previous work did not deal with such implicitly expressed causalities and failed to find proper answers that contained them. We improve why-QA based on the following two ideas. First, a causality expressed implicitly in one text may be expressed with explicit cues in other texts. If we can automatically recognize such explicitly expressed causalities in a text archive and use them to complement the implicitly expressed causalities in an answer passage, we can improve why-QA. Second, the causes of similar events tend to be described with a similar set of words (e.g., "seismic energy" and "tectonic plates" for "the Great East Japan Earthquake" and "the 1906 San Francisco Earthquake"). Consequently, even if a text archive contains no explicitly expressed cause of the event in a question (e.g., "Why did the Great East Japan Earthquake happen?"), we may still be able to identify its implicitly expressed causes through words (e.g., "tectonic plates") that appear in the explicitly expressed cause of a similar event (e.g., "the 1906 San Francisco Earthquake").
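The first idea can be pictured with a minimal sketch, under our own simplifying assumptions rather than as the authors' actual pipeline: explicit causality expressions are pulled from an archive using cue words, the cause-side words of causalities whose effect resembles the question event are pooled, and a candidate answer passage is scored by its overlap with that pool. The cue list and the helper names (`find_explicit_causalities`, `cause_word_profile`, `score_passage`) are hypothetical.

```python
# Illustrative sketch only (not the authors' implementation): complement
# implicitly expressed causalities in a passage with explicitly cued ones
# retrieved from a text archive, then score the passage by word overlap.
import re
from collections import Counter

EXPLICIT_CUES = ("because", "since", "due to")  # illustrative English cues

def find_explicit_causalities(archive_sentences):
    """Return (cause_text, effect_text) pairs from sentences with explicit cues."""
    pairs = []
    for sent in archive_sentences:
        for cue in EXPLICIT_CUES:
            m = re.search(rf"(?P<effect>.+)\s+{cue}\s+(?P<cause>.+)", sent, re.I)
            if m:
                pairs.append((m.group("cause"), m.group("effect")))
                break
    return pairs

def cause_word_profile(pairs, event_words):
    """Pool cause-side words of causalities whose effect overlaps the event words."""
    profile = Counter()
    for cause, effect in pairs:
        if set(effect.lower().split()) & event_words:  # crude event similarity
            profile.update(cause.lower().split())
    return profile

def score_passage(passage, profile):
    """Score a candidate answer passage by overlap with the pooled cause words."""
    words = passage.lower().split()
    return sum(profile[w] for w in words) / (len(words) or 1)

if __name__ == "__main__":
    archive = ["The 1906 San Francisco earthquake happened because tectonic "
               "plates slipped and released seismic energy."]
    profile = cause_word_profile(find_explicit_causalities(archive),
                                 {"earthquake", "happen", "happened"})
    passage = ("Stress between tectonic plates built up for decades; the released "
               "seismic energy displaced sea water and a tsunami was generated.")
    print(round(score_passage(passage, profile), 3))
```

The word-overlap scoring here is only a stand-in; the proposal in the abstract rests on large-scale automatic causality recognition rather than this toy heuristic.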
We implemented these two ideas in our multi-column convolutional neural networks with a novel attention mechanism that we call causality attention. Through experiments on Japanese why-QA, we confirmed that our proposed method outperformed state-of-the-art systems.
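As a rough sketch of how causality attention could be wired into a multi-column convolutional network (our illustrative PyTorch code under stated assumptions, not the authors' implementation), one column reads the passage's word embeddings directly while another reads the same embeddings rescaled by per-word causality-attention weights, e.g., association scores between passage words and question words derived from cause-effect pairs. All dimensions and the class name `CausalityAttentionMCNN` are assumptions.

```python
# Minimal sketch (assumptions, not the paper's code): a two-column CNN in which
# one column sees plain word embeddings and the other sees the same embeddings
# reweighted by precomputed causality-attention scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalityAttentionMCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=50, n_filters=32, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # one 1-D convolution per column (plain / causality-attention column)
        self.conv_plain = nn.Conv1d(emb_dim, n_filters, kernel, padding=1)
        self.conv_attn = nn.Conv1d(emb_dim, n_filters, kernel, padding=1)
        self.out = nn.Linear(2 * n_filters, 2)   # correct vs. incorrect answer

    def forward(self, passage_ids, attn_weights):
        # passage_ids: (batch, len); attn_weights: (batch, len) causality scores
        x = self.emb(passage_ids)                          # (batch, len, emb)
        plain = self.conv_plain(x.transpose(1, 2))         # (batch, filters, len)
        attended = x * attn_weights.unsqueeze(-1)          # reweight word vectors
        attn = self.conv_attn(attended.transpose(1, 2))
        pooled = torch.cat([plain.max(dim=2).values,
                            attn.max(dim=2).values], dim=1)
        return self.out(F.relu(pooled))

# Toy usage: the attention weights would come from causality statistics, e.g.
# association between question words and passage words over cause-effect pairs.
model = CausalityAttentionMCNN(vocab_size=100)
passage = torch.randint(0, 100, (1, 12))
weights = torch.rand(1, 12)           # placeholder causality-attention scores
print(model(passage, weights).shape)  # torch.Size([1, 2])
```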