
Interpretable modular knowledge reasoning for machine reading comprehension

  • Original Article
  • Neural Computing and Applications

Abstract

Machine reading comprehension (MRC) is a fundamental task for evaluating the natural language understanding ability of a model; it requires complicated reasoning over the knowledge involved in the context as well as world knowledge. However, most existing approaches ignore this reasoning process, solving the task with a one-step “black box” model and massive data augmentation. In this paper, we therefore propose a modular knowledge reasoning approach based on neural network modules that explicitly model each step of the reasoning process. Five reasoning modules are designed and learned in an end-to-end manner, yielding a more interpretable model. Experiments on the reasoning over paragraph effects in situations (ROPES) dataset, a challenging benchmark that requires reasoning about how the effects described in a paragraph apply to a given situation, demonstrate the effectiveness and explainability of our approach. Moreover, transferring our reasoning modules to the WinoGrande dataset under a zero-shot setting achieves results competitive with data-augmented models, proving their generalization capability.
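The modular idea sketched in the abstract can be illustrated with a toy pipeline in which each reasoning step is an explicit, inspectable module and a controller chains them. This is only an illustrative sketch: the module names (`find`, `relate`, `compare`) and the toy data below are hypothetical and are not the five learned neural modules of the paper, which operate on learned representations rather than string matching.

```python
# Illustrative sketch of modular reasoning (hypothetical module names and
# data; the paper's actual modules are learned neural networks).

def find(context, query):
    """Retrieval step: return context sentences mentioning the query term."""
    return [s for s in context if query in s]

def relate(sentences, cause, effect):
    """Knowledge step: check whether any sentence links the cause to the effect."""
    return any(cause in s and effect in s for s in sentences)

def compare(option_a, option_b, polarity):
    """Answer step: pick the option consistent with the inferred effect."""
    return option_a if polarity else option_b

def reason(context, question):
    """Controller: chain the modules (retrieve -> infer effect -> select)."""
    evidence = find(context, question["cause"])
    polarity = relate(evidence, question["cause"], question["effect"])
    return compare(question["a"], question["b"], polarity)

# Toy ROPES-style instance: a background paragraph states an effect,
# and the question asks how it applies to a new situation.
context = ["More sunlight causes faster plant growth."]
question = {"cause": "sunlight", "effect": "growth",
            "a": "Plot A", "b": "Plot B"}
answer = reason(context, question)  # the plot with more sunlight
```

Because each step's output (retrieved evidence, inferred polarity, selected answer) is exposed, the intermediate reasoning can be inspected, which is the interpretability benefit that motivates the modular design over a one-step black-box model.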



Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61751201) and the National Key R&D Plan (No. 2016QY03D0602).

Author information


Corresponding author

Correspondence to Mucheng Ren.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ren, M., Huang, H. & Gao, Y. Interpretable modular knowledge reasoning for machine reading comprehension. Neural Comput & Applic 34, 9901–9918 (2022). https://doi.org/10.1007/s00521-022-06975-2
