Abstract
This paper studies spatial commonsense reasoning for the machine reading comprehension task. Spatial commonsense is the shared but largely unstated human knowledge of object shape, size, distance, and position. Reasoning over this latent knowledge helps machines better perceive their surroundings, which is crucial for general intelligence, yet the topic is challenging and remains understudied. To bridge this research gap, we propose a new method for spatial reasoning. Given a text, we first build a candidate reasoning graph from its parse tree. To better support spatial reasoning, we retrieve related commonsense entities and relations from external knowledge sources: a pre-trained language model (LM), which covers broad factual knowledge, and a knowledge graph (KG), which provides abundant commonsense relations. We then propose a new fusion method, LEGRN (LM Edge-GNN Reasoner Networks), which integrates the LM text encoder and the KG graph encoder through layer-based attention, capturing correlations between the LM's textual context and the KG's graph structure. Because spatial relations involve a variety of attributes, we further propose an attribute-aware inferential network to deduce the correct answers. To evaluate our approach, we construct CRCSpatial, a new large-scale dataset of 40k spatial reasoning questions. Experimental results demonstrate the effectiveness of our approach.
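The abstract names LEGRN's components but not their internals. As a rough illustration of what layer-based attention between an LM text encoder and a KG graph encoder could look like, the following is a minimal PyTorch sketch; every module name, dimension, and the bidirectional cross-attention scheme are assumptions made for illustration, not the paper's actual design.

```python
# A minimal, hypothetical sketch of layer-based LM/GNN fusion in the spirit
# of LEGRN as described in the abstract. All names and shapes are assumed.
import torch
import torch.nn as nn


class FusionLayer(nn.Module):
    """One fusion step: text tokens attend to graph nodes and vice versa."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.text_to_graph = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.graph_to_text = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, text: torch.Tensor, graph: torch.Tensor):
        # text:  (batch, num_tokens, dim) -- LM hidden states at this layer
        # graph: (batch, num_nodes,  dim) -- GNN node states at this layer
        fused_text, _ = self.text_to_graph(text, graph, graph)
        fused_graph, _ = self.graph_to_text(graph, text, text)
        # Residual connections let each stream keep its own representation
        # while absorbing information from the other.
        return text + fused_text, graph + fused_graph


class LayerwiseFusionReasoner(nn.Module):
    """Stack fusion layers so each LM layer exchanges information with the
    corresponding GNN layer before answer scoring."""

    def __init__(self, dim: int = 768, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(FusionLayer(dim) for _ in range(num_layers))
        self.scorer = nn.Linear(2 * dim, 1)  # pooled text + graph -> score

    def forward(self, text: torch.Tensor, graph: torch.Tensor):
        for layer in self.layers:
            text, graph = layer(text, graph)
        pooled = torch.cat([text.mean(dim=1), graph.mean(dim=1)], dim=-1)
        return self.scorer(pooled).squeeze(-1)


if __name__ == "__main__":
    model = LayerwiseFusionReasoner()
    text = torch.randn(2, 32, 768)   # toy LM token states
    graph = torch.randn(2, 16, 768)  # toy KG node states
    print(model(text, graph).shape)  # torch.Size([2])
```

Exchanging information at every layer, rather than fusing once after both encoders finish, is one common way such text-graph models capture fine-grained correlations between context and structure.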
Acknowledgement
This work is supported by the National Natural Science Foundation of China (62276279, 62002396), the Key-Area Research and Development Program of Guangdong Province (2020B0101100001), the Tencent WeChat Rhino-Bird Focused Research Program (WXG-FR-2023-06), and Zhuhai Industry-University-Research Cooperation Project (2220004002549).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, M. et al. (2023). Spatial Commonsense Reasoning for Machine Reading Comprehension. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol. 14177. Springer, Cham. https://doi.org/10.1007/978-3-031-46664-9_24
DOI: https://doi.org/10.1007/978-3-031-46664-9_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46663-2
Online ISBN: 978-3-031-46664-9
eBook Packages: Computer Science (R0)