Abstract
Multi-hop reading comprehension requires aggregating multiple evidence facts to answer complex natural language questions, and the model should abstain from answering when no answer exists. Training a model for such a difficult task requires large amounts of labeled data, but annotation is expensive and time-consuming, so it is important to explore reading comprehension models that work well in low-data settings; large-scale task-related external data can also effectively improve model performance. This paper proposes DeMRC, a two-stage model with a dynamic context-enhancement method for multi-hop reading comprehension under low data. The first-stage sentence filtering model selects the top k sentences most relevant to the question. The second-stage answer prediction model dynamically reconstructs the training set in every training pass to expand the data, and at inference time takes only the sentences selected by the filtering model as input, reducing interference from irrelevant sentences. In addition, a self-training method pseudo-labels external data and uses it as an auxiliary data set to further improve performance. We conduct experiments on the Chinese multi-hop reading comprehension data set from the “CAIL 2020” Judicial Artificial Intelligence Challenge reading comprehension track and on HotpotQA, an English cross-document data set, and improve on strong baseline models by 3.5% and 21.3%, respectively, demonstrating the effectiveness of the method.
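As a rough illustration of the two-stage pipeline, the sketch below scores each candidate sentence against the question with a sequence classifier, keeps the top k, and runs extractive span prediction over only the filtered context. This is a minimal sketch using Hugging Face Transformers; the checkpoint name, the value of k, and the untrained classification/QA heads are assumptions for illustration, not the paper's actual configuration.

```python
# Illustrative sketch of the two-stage pipeline from the abstract.
# Stage 1 keeps the top-k sentences most relevant to the question;
# stage 2 predicts an answer span over only the filtered context.
# The checkpoint and k are assumptions; heads here are not fine-tuned.
import torch
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification,
                          AutoModelForQuestionAnswering)

CKPT = "bert-base-chinese"  # assumed encoder, not the paper's model
tok = AutoTokenizer.from_pretrained(CKPT)
filter_model = AutoModelForSequenceClassification.from_pretrained(
    CKPT, num_labels=2)  # label 1 = "relevant to the question"
qa_model = AutoModelForQuestionAnswering.from_pretrained(CKPT)

def filter_sentences(question, sentences, k=5):
    """Stage 1: score every sentence against the question, keep top k."""
    scores = []
    for sent in sentences:
        enc = tok(question, sent, return_tensors="pt",
                  truncation=True, max_length=512)
        with torch.no_grad():
            logits = filter_model(**enc).logits
        scores.append(logits.softmax(-1)[0, 1].item())  # P(relevant)
    ranked = sorted(zip(scores, sentences), reverse=True)
    return [s for _, s in ranked[:k]]

def answer(question, sentences, k=5):
    """Stage 2: extractive QA over the filtered context only."""
    context = " ".join(filter_sentences(question, sentences, k))
    enc = tok(question, context, return_tensors="pt",
              truncation=True, max_length=512)
    with torch.no_grad():
        out = qa_model(**enc)
    start = out.start_logits.argmax()
    end = out.end_logits.argmax()
    return tok.decode(enc["input_ids"][0][start:end + 1])
```

In practice both stages would be fine-tuned on the task data; the point of the sketch is the control flow, where irrelevant sentences never reach the answer model.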
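One plausible reading of "dynamically constructs the training set every time during training" is that each gold evidence set is paired with freshly sampled distractor sentences on every pass, so the answer model rarely sees the same context twice. The helper below sketches that idea under this assumption; the sampling scheme and all parameter names are hypothetical.

```python
# Hypothetical sketch of dynamic training-set construction: mix the gold
# evidence with newly sampled distractors each epoch to expand the data.
import random

def build_dynamic_example(question, gold_sentences, distractor_pool,
                          n_distract=3):
    """Return a fresh (question, context) pair; distractor_pool must
    contain at least n_distract sentences."""
    context = gold_sentences + random.sample(distractor_pool, n_distract)
    random.shuffle(context)  # avoid a fixed position for gold evidence
    return question, " ".join(context)
```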
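The self-training step can be summarized as a generic loop: train on the gold data, pseudo-label external examples, keep only confident predictions, and retrain on the union. The sketch below assumes labeled data is a list of (example, answer) pairs and that `train_fn` and `predict_fn` are supplied by the caller; the 0.9 confidence threshold is an assumption, not the paper's criterion.

```python
# Generic self-training loop; train_fn and predict_fn are caller-supplied
# placeholders, and the threshold/round counts are illustrative defaults.
def self_train(model, labeled_set, external_set, train_fn, predict_fn,
               threshold=0.9, rounds=2):
    """train_fn(model, pairs) -> model; predict_fn(model, x) -> (ans, conf)."""
    for _ in range(rounds):
        model = train_fn(model, labeled_set)
        pseudo = []
        for example in external_set:
            ans, conf = predict_fn(model, example)
            if conf >= threshold:  # keep only confident pseudo-labels
                pseudo.append((example, ans))
        # Retrain on gold labels plus the auxiliary pseudo-labeled set.
        model = train_fn(model, list(labeled_set) + pseudo)
    return model
```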
Acknowledgments
This work is supported by the State Grid Zhejiang Electric Power Co., Ltd. science and technology project "Research and Application of Intelligent Operation and Inspection Technology Based on Natural Language Processing and Artificial Intelligence Technology".
The authors would like to thank the AI+ High Performance Computing Center of ZJU-ICI.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tang, X., Xu, Y., Lu, X., He, Q., Fang, J., Chen, J. (2022). DeMRC: Dynamically Enhanced Multi-hop Reading Comprehension Model for Low Data. In: Chen, W., Yao, L., Cai, T., Pan, S., Shen, T., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2022. Lecture Notes in Computer Science, vol 13726. Springer, Cham. https://doi.org/10.1007/978-3-031-22137-8_4