Abstract
Remarkable success has been achieved on machine reading comprehension tasks in the last few years. Previous works capture long-range dependencies by explicitly attending to all tokens and modeling the relations between the question and each sentence. However, these works ignore a great deal of important information about token-level and sentence-level relations within the passage, which is useful for inferring the answer. We observe that contextual information at both the token level and the sentence level of the same passage plays a vital role in reading comprehension. To address this problem, we propose a multi-stage maximization attention (MMA) network, which captures the important relations in the passage at different levels of granularity, exploiting its hierarchical nature. Using MMA as a module, we integrate two sentence-level question-aware matching mechanisms to infer the answer: (1) co-matching, which matches the passage with the question and the candidate answer; and (2) sentence-level hierarchical attention, which identifies the importance of each sentence conditioned on the question and the option. In addition, inspired by how humans solve multi-choice reading comprehension questions, a passage sentence selection strategy is fused into our model to select the most salient sentences and guide the model in inferring the answer. The proposed model is evaluated on three multi-choice reading comprehension datasets: RACE, Dream, and MultiRC. Significance tests demonstrate its improvement over existing MRC models, and a series of analyses further interpret its effectiveness.
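To make the matching mechanism above concrete, the following is a minimal PyTorch sketch of one co-matching stage followed by a token-wise maximization, in the spirit of the described MMA module. The tensor shapes, the `co_match` helper, and the use of max-pooling as the maximization step are illustrative assumptions, not the authors' exact implementation.

```python
# Hedged sketch: one co-matching stage with a maximization step.
# Shapes and the max-pooling aggregation are assumptions for illustration,
# not the paper's exact MMA architecture.
import torch
import torch.nn.functional as F

def co_match(Hp, Hq, Ha):
    """Match passage states Hp (B, Tp, d) against question states Hq (B, Tq, d)
    and candidate-answer states Ha (B, Ta, d); returns (B, Tp, 4d) features."""
    def attend(H, G):
        scores = torch.bmm(H, G.transpose(1, 2))       # token-level attention scores
        ctx = torch.bmm(F.softmax(scores, dim=-1), G)  # G attended for each token of H
        return torch.cat([ctx - H, ctx * H], dim=-1)   # element-wise comparison features
    Mq = attend(Hp, Hq)                                # passage-question matching
    Ma = attend(Hp, Ha)                                # passage-answer matching
    return torch.cat([Mq, Ma], dim=-1)

# Sentence-level vector via maximization over token positions, mirroring the
# "maximization attention" idea at a coarser level of granularity.
B, Tp, Tq, Ta, d = 2, 30, 10, 6, 64
Hp, Hq, Ha = torch.randn(B, Tp, d), torch.randn(B, Tq, d), torch.randn(B, Ta, d)
sentence_vec, _ = co_match(Hp, Hq, Ha).max(dim=1)      # (B, 4d)
```

Max-pooling over the matching features keeps only the strongest evidence per dimension, which is one plausible reading of how a maximization step selects salient signals for the answer.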






Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grants 81860318 and 81560296.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Yan, H., Liu, L., Feng, X. et al. Leveraging greater relations for improving multi-choice reading comprehension. Neural Comput & Applic 34, 20851–20864 (2022). https://doi.org/10.1007/s00521-022-07561-2