ABSTRACT
Machine reading comprehension (MRC) is an essential task for many question-answering applications. However, existing MRC datasets mainly focus on single-answer data and overlook questions with multiple answers, which are common in the real world. In this paper, we aim to construct an MRC dataset containing both single-answer and multi-answer data. To this end, we design a novel pipeline consisting of data collection, data cleaning, question generation, and test-set annotation. Based on these procedures, we construct a high-quality multi-answer MRC dataset (MA-MRC) with 129K question-answer-context samples. We implement a series of baselines and carry out extensive experiments on MA-MRC. The experimental results show that MA-MRC is a challenging dataset that can facilitate future research on the multi-answer MRC task.
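To make the multi-answer setting concrete, the sketch below shows one way a question-answers-context sample might be represented and scored with set-level exact match and F1. The field names and metric definitions here are illustrative assumptions, not the official MA-MRC data format or evaluation script.

```python
# Illustrative multi-answer MRC sample and set-level metrics.
# Field names and metric definitions are assumptions for illustration,
# not the official MA-MRC release.

def normalize(text):
    """Lowercase and collapse whitespace for lenient string comparison."""
    return " ".join(text.lower().split())

def multi_answer_exact_match(predictions, gold_answers):
    """Return 1.0 only if the predicted answer set equals the gold set."""
    pred = {normalize(p) for p in predictions}
    gold = {normalize(g) for g in gold_answers}
    return float(pred == gold)

def multi_answer_f1(predictions, gold_answers):
    """Set-level F1 between predicted and gold answer strings."""
    pred = {normalize(p) for p in predictions}
    gold = {normalize(g) for g in gold_answers}
    if not pred or not gold:
        return float(pred == gold)
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# A toy multi-answer sample: one question, two gold answer spans.
sample = {
    "context": "Berlin and Hamburg are cities in Germany.",
    "question": "Which German cities are mentioned?",
    "answers": ["Berlin", "Hamburg"],
}

print(multi_answer_exact_match(["berlin", "Hamburg"], sample["answers"]))  # 1.0
print(multi_answer_f1(["Berlin"], sample["answers"]))                      # 0.666...
```

Under this scoring, a system that extracts only one of the two gold answers is penalized on recall, which is what makes the multi-answer setting strictly harder than single-span extraction.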
MA-MRC: A Multi-answer Machine Reading Comprehension Dataset