Abstract
This paper studies spatial commonsense reasoning for the machine reading comprehension task. Spatial commonsense is the shared but largely unstated human knowledge of object shape, size, distance, and position. Reasoning over this latent knowledge helps machines better perceive their surroundings, which is crucial for general intelligence, yet the topic is challenging and remains understudied. To bridge this research gap, we propose a new method for spatial reasoning. Given a text, we first build a candidate reasoning graph from its parse tree. To better support spatial reasoning, we retrieve related commonsense entities and relations from external knowledge sources: a pre-trained language model (LM), which covers broad factual knowledge, and a knowledge graph (KG), which provides abundant commonsense relations. We then propose a new fusion method, LEGRN (LM Edge-GNN Reasoner Networks), which integrates the LM text encoder and the KG graph encoder through layer-based attention, capturing correlations between the LM's textual context and the KG's graph structure. Because spatial relations involve a variety of attributes, we further propose an attribute-aware inferential network to deduce the correct answers. To evaluate our approach, we construct CRCSpatial, a new large-scale dataset of 40k spatial reasoning questions. Experimental results demonstrate the effectiveness of our approach.
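The abstract names LEGRN's components but not their internals. As a rough illustration of what layer-based attention between an LM text encoder and a KG graph encoder could look like, the following is a minimal PyTorch sketch; every module name, dimension, and the bidirectional cross-attention scheme are assumptions made for illustration, not the paper's actual design.

```python
# A minimal, hypothetical sketch of layer-based LM/GNN fusion in the spirit
# of LEGRN as described in the abstract. All names and shapes are assumed.
import torch
import torch.nn as nn


class FusionLayer(nn.Module):
    """One fusion step: text tokens attend to graph nodes and vice versa."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.text_to_graph = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.graph_to_text = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, text: torch.Tensor, graph: torch.Tensor):
        # text:  (batch, num_tokens, dim) -- LM hidden states at this layer
        # graph: (batch, num_nodes,  dim) -- GNN node states at this layer
        fused_text, _ = self.text_to_graph(text, graph, graph)
        fused_graph, _ = self.graph_to_text(graph, text, text)
        # Residual connections let each stream keep its own representation
        # while absorbing information from the other.
        return text + fused_text, graph + fused_graph


class LayerwiseFusionReasoner(nn.Module):
    """Stack fusion layers so each LM layer exchanges information with the
    corresponding GNN layer before answer scoring."""

    def __init__(self, dim: int = 768, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(FusionLayer(dim) for _ in range(num_layers))
        self.scorer = nn.Linear(2 * dim, 1)  # pooled text + graph -> score

    def forward(self, text: torch.Tensor, graph: torch.Tensor):
        for layer in self.layers:
            text, graph = layer(text, graph)
        pooled = torch.cat([text.mean(dim=1), graph.mean(dim=1)], dim=-1)
        return self.scorer(pooled).squeeze(-1)


if __name__ == "__main__":
    model = LayerwiseFusionReasoner()
    text = torch.randn(2, 32, 768)   # toy LM token states
    graph = torch.randn(2, 16, 768)  # toy KG node states
    print(model(text, graph).shape)  # torch.Size([2])
```

Exchanging information at every layer, rather than fusing once after both encoders finish, is one common way such text-graph models capture fine-grained correlations between context and structure.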
Acknowledgement
This work is supported by the National Natural Science Foundation of China (62276279, 62002396), the Key-Area Research and Development Program of Guangdong Province (2020B0101100001), the Tencent WeChat Rhino-Bird Focused Research Program (WXG-FR-2023-06), and Zhuhai Industry-University-Research Cooperation Project (2220004002549).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, M. et al. (2023). Spatial Commonsense Reasoning for Machine Reading Comprehension. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol. 14177. Springer, Cham. https://doi.org/10.1007/978-3-031-46664-9_24
DOI: https://doi.org/10.1007/978-3-031-46664-9_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46663-2
Online ISBN: 978-3-031-46664-9
eBook Packages: Computer Science (R0)