Abstract
This paper proposes a method to capture the relations between options in Multiple-choice Machine Reading Comprehension (MMRC) tasks. MMRC is a form of question answering (QA) in which the question is about a given text and multiple candidate answers are provided as options. Capturing the relations between options is especially important for options that refer to other options and cannot stand alone as answers to the question, such as “None of the above”. Our method 1) takes the whole sample, including the passage, the question, and all options, as input to a pre-trained language model, and 2) adds a fuser network to strengthen the information interaction between options. Experimental results show that our method improves over common encoding approaches on COSMOS-QA, an MMRC dataset with between-option references, while having a relatively small impact on other MMRC datasets without such references. We conclude that our method indeed helps to capture the necessary relations between options. In addition, our method reduces the memory required for training, and it transfers easily to other domains and models.
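As an illustration of this encoding scheme, the sketch below wires a full-context input (passage, question, and all options in a single sequence) to a pre-trained encoder and adds a small attention layer over the per-option representations. The encoder name, the pooling of option vectors at given marker positions, and the single self-attention layer standing in for the fuser are illustrative assumptions, not the authors' exact implementation.

# Illustrative sketch only (not the exact architecture from the paper). Assumptions:
# a Hugging Face RoBERTa encoder, option vectors pooled at given marker positions,
# and a single self-attention layer standing in for the fuser network.
import torch
import torch.nn as nn
from transformers import AutoModel

class FullContextMC(nn.Module):
    def __init__(self, model_name="roberta-base"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Hypothetical "fuser": options attend to each other, so an option such as
        # "None of the above" can be scored in the context of the other options.
        self.fuser = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask, option_positions):
        # One sequence per sample: passage, question, and ALL options together.
        states = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        batch_idx = torch.arange(input_ids.size(0)).unsqueeze(-1)
        option_vecs = states[batch_idx, option_positions]   # (batch, num_options, hidden)
        fused, _ = self.fuser(option_vecs, option_vecs, option_vecs)
        return self.scorer(fused).squeeze(-1)                # (batch, num_options) logits

In this reading, the full-context input already lets every option token attend to every other option inside the encoder, and the fuser adds one more explicit interaction step over the pooled option vectors before scoring.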
A Appendix
A.1 Truncation Statistics
As mentioned in Sect. 4.2, we truncated passages that exceed the length limit into segments, each of which is combined with the corresponding question and options as input. The truncation statistics are shown in Table 8.
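A minimal sketch of this segmentation step is given below; the tokenizer choice, the 512-token limit, and the rough allowance for special tokens are assumptions for illustration, not the exact preprocessing behind Table 8.

# Minimal sketch of the truncation described above; tokenizer, length limit, and the
# rough special-token allowance are illustrative assumptions.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def split_into_segments(passage, question, options, max_len=512):
    # Reserve room for the question and all options, which accompany every segment.
    question_options = question + " " + " ".join(options)
    reserved = len(tokenizer.tokenize(question_options)) + 4  # rough special-token allowance
    budget = max_len - reserved
    passage_tokens = tokenizer.tokenize(passage)
    segments = [passage_tokens[i:i + budget] for i in range(0, len(passage_tokens), budget)]
    # Each passage segment is combined with the question and all options as one input.
    return [tokenizer.convert_tokens_to_string(seg) + " " + question_options for seg in segments]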
A.2 Hyper-parameter Settings
The hyper-parameters for the major experiments in Table 3 and Table 4 are summarized in Table 9.
For the transfer learning experiment on XLM-RoBERTa-large, the hyper-parameters for intermediate training and post-fine-tuning are shown in Table 10.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wang, R., Verberne, S., Spruit, M. (2024). Attend All Options at Once: Full Context Input for Multi-choice Reading Comprehension. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14608. Springer, Cham. https://doi.org/10.1007/978-3-031-56027-9_24
DOI: https://doi.org/10.1007/978-3-031-56027-9_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56026-2
Online ISBN: 978-3-031-56027-9