DOI: 10.1145/3459104.3459149

Answer Selection Using Reinforcement Learning for Complex Question Answering on the Open Domain

Published: 20 July 2021

Abstract

Open-domain multiple-choice question answering is the task of answering challenging questions from multiple domains when the text corpora contain no direct evidence for the answer; its main application is self-tutoring. We propose the Multiple-Choice Reinforcement Learner (MCRL) model, which uses a policy gradient algorithm in a partially observable Markov decision process to reformulate question-answer pairs into queries that retrieve new pieces of evidence for each answer choice. Its inputs are the question and the answer choices. Over successive iteration cycles, MCRL learns to generate queries that improve the evidence retrieved for each answer choice; after a predefined number of cycles, it returns the best answer choice together with the text passages that support it. Using accuracy and mean reward per episode, we conduct an in-depth hyperparameter analysis of the number of iteration cycles, the design of the reward function, and the weight given to the evidence found in each cycle when selecting the final answer choice. The best-performing MCRL configuration reached an accuracy of 0.346, higher than naive, random, and traditional end-to-end deep learning QA baselines. We conclude with recommendations for future development of the model, which can be adapted to other languages given a text corpus and a word embedding model for each language.
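
The abstract describes the MCRL loop only in prose, so the short sketch below illustrates the general shape of such an iteration cycle: a REINFORCE-style policy selects query terms, a retriever scores the evidence each query finds for every answer choice, the per-cycle evidence is accumulated, and the policy is updated from the reward. This is a minimal illustration, not the authors' implementation; the vocabulary, the toy_retriever stand-in for corpus search, the Bernoulli term-selection policy, and the reward shaping are all assumptions made for this sketch.

import numpy as np

# Illustrative sketch only: a REINFORCE-style policy that picks query terms,
# a stand-in retriever, and an evidence-accumulating iteration cycle.
rng = np.random.default_rng(0)
VOCAB = ["photosynthesis", "energy", "sunlight", "chlorophyll", "mitochondria", "glucose"]

def toy_retriever(query_terms, answer_choice):
    # Stand-in for searching a text corpus: evidence score in [0, 1] from term overlap.
    overlap = len(set(query_terms) & set(answer_choice.split()))
    return min(1.0, 0.3 * overlap)

def sample_query(theta):
    # Policy: include each vocabulary term independently with probability sigmoid(theta_i).
    probs = 1.0 / (1.0 + np.exp(-theta))
    mask = rng.random(len(VOCAB)) < probs
    return [t for t, m in zip(VOCAB, mask) if m], mask, probs

def episode(theta, choices, cycles=3, lr=0.1):
    evidence = np.zeros(len(choices))
    for _ in range(cycles):                        # the "iteration cycles" of the abstract
        query, mask, probs = sample_query(theta)
        scores = np.array([toy_retriever(query, c) for c in choices])
        evidence += scores                         # accumulate evidence per answer choice
        reward = scores.max()                      # assumed reward: best evidence this cycle
        grad = mask.astype(float) - probs          # d log pi / d theta for a Bernoulli policy
        theta += lr * reward * grad                # REINFORCE update (Williams, 1992)
    return int(evidence.argmax()), evidence

choices = ["sunlight energy glucose", "mitochondria only", "chlorophyll sunlight", "none of these"]
theta = np.zeros(len(VOCAB))
answer, evidence = episode(theta, choices)
print("selected choice:", answer, "evidence per choice:", np.round(evidence, 2))

In the actual MCRL model the retriever searches real text corpora and the reward design and evidence weighting are the hyperparameters analyzed in the paper; the toy version above only makes the control flow of the iteration cycles and the policy gradient update concrete.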

Information

          Published In

ISEEIE 2021: 2021 International Symposium on Electrical, Electronics and Information Engineering
February 2021
644 pages
ISBN: 9781450389839
DOI: 10.1145/3459104

Publisher

Association for Computing Machinery, New York, NY, United States


          Qualifiers

          • Research-article
          • Research
          • Refereed limited

          Funding Sources

          • CNPq
          • PBI

          Conference

          ISEEIE 2021
