Abstract
Multi-hop reading comprehension requires aggregating multiple evidence facts to answer complex natural language questions, and the model should abstain from answering when no answer exists. Training a model for such a difficult task requires large amounts of labeled data, but annotation is expensive and time-consuming, so it is important to explore reading comprehension models that work well in low-data settings; large-scale task-related external data can also effectively improve model performance. This paper proposes DeMRC, a two-stage model with a dynamic context-enhancement method for multi-hop reading comprehension under low data. The first-stage sentence filtering model selects the top k sentences most relevant to the question. The second-stage answer prediction model dynamically reconstructs the training set in every training pass to expand the data, and at inference time takes only the sentences selected by the filtering model as input, reducing interference from irrelevant sentences. In addition, a self-training method pseudo-labels external data and uses it as an auxiliary data set to further improve performance. We conduct experiments on the Chinese multi-hop reading comprehension data set from the “CAIL 2020” Judicial Artificial Intelligence Challenge reading comprehension track and on HotpotQA, an English cross-document data set, and improve on strong baseline models by 3.5% and 21.3%, respectively, demonstrating the effectiveness of the method.
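As a rough illustration of the two-stage pipeline, the sketch below scores each candidate sentence against the question with a sequence classifier, keeps the top k, and runs extractive span prediction over only the filtered context. This is a minimal sketch using Hugging Face Transformers; the checkpoint name, the value of k, and the untrained classification/QA heads are assumptions for illustration, not the paper's actual configuration.

```python
# Illustrative sketch of the two-stage pipeline from the abstract.
# Stage 1 keeps the top-k sentences most relevant to the question;
# stage 2 predicts an answer span over only the filtered context.
# The checkpoint and k are assumptions; heads here are not fine-tuned.
import torch
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification,
                          AutoModelForQuestionAnswering)

CKPT = "bert-base-chinese"  # assumed encoder, not the paper's model
tok = AutoTokenizer.from_pretrained(CKPT)
filter_model = AutoModelForSequenceClassification.from_pretrained(
    CKPT, num_labels=2)  # label 1 = "relevant to the question"
qa_model = AutoModelForQuestionAnswering.from_pretrained(CKPT)

def filter_sentences(question, sentences, k=5):
    """Stage 1: score every sentence against the question, keep top k."""
    scores = []
    for sent in sentences:
        enc = tok(question, sent, return_tensors="pt",
                  truncation=True, max_length=512)
        with torch.no_grad():
            logits = filter_model(**enc).logits
        scores.append(logits.softmax(-1)[0, 1].item())  # P(relevant)
    ranked = sorted(zip(scores, sentences), reverse=True)
    return [s for _, s in ranked[:k]]

def answer(question, sentences, k=5):
    """Stage 2: extractive QA over the filtered context only."""
    context = " ".join(filter_sentences(question, sentences, k))
    enc = tok(question, context, return_tensors="pt",
              truncation=True, max_length=512)
    with torch.no_grad():
        out = qa_model(**enc)
    start = out.start_logits.argmax()
    end = out.end_logits.argmax()
    return tok.decode(enc["input_ids"][0][start:end + 1])
```

In practice both stages would be fine-tuned on the task data; the point of the sketch is the control flow, where irrelevant sentences never reach the answer model.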
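One plausible reading of "dynamically constructs the training set every time during training" is that each gold evidence set is paired with freshly sampled distractor sentences on every pass, so the answer model rarely sees the same context twice. The helper below sketches that idea under this assumption; the sampling scheme and all parameter names are hypothetical.

```python
# Hypothetical sketch of dynamic training-set construction: mix the gold
# evidence with newly sampled distractors each epoch to expand the data.
import random

def build_dynamic_example(question, gold_sentences, distractor_pool,
                          n_distract=3):
    """Return a fresh (question, context) pair; distractor_pool must
    contain at least n_distract sentences."""
    context = gold_sentences + random.sample(distractor_pool, n_distract)
    random.shuffle(context)  # avoid a fixed position for gold evidence
    return question, " ".join(context)
```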
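The self-training step can be summarized as a generic loop: train on the gold data, pseudo-label external examples, keep only confident predictions, and retrain on the union. The sketch below assumes labeled data is a list of (example, answer) pairs and that `train_fn` and `predict_fn` are supplied by the caller; the 0.9 confidence threshold is an assumption, not the paper's criterion.

```python
# Generic self-training loop; train_fn and predict_fn are caller-supplied
# placeholders, and the threshold/round counts are illustrative defaults.
def self_train(model, labeled_set, external_set, train_fn, predict_fn,
               threshold=0.9, rounds=2):
    """train_fn(model, pairs) -> model; predict_fn(model, x) -> (ans, conf)."""
    for _ in range(rounds):
        model = train_fn(model, labeled_set)
        pseudo = []
        for example in external_set:
            ans, conf = predict_fn(model, example)
            if conf >= threshold:  # keep only confident pseudo-labels
                pseudo.append((example, ans))
        # Retrain on gold labels plus the auxiliary pseudo-labeled set.
        model = train_fn(model, list(labeled_set) + pseudo)
    return model
```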
Acknowledgments
This work is supported by the State Grid Zhejiang Electric Power Co., Ltd. science and technology project "Research and Application of Intelligent Operation and Inspection Technology Based on Natural Language Processing and Artificial Intelligence Technology".
The authors would like to thank the AI+ High Performance Computing Center of ZJU-ICI.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tang, X., Xu, Y., Lu, X., He, Q., Fang, J., Chen, J. (2022). DeMRC: Dynamically Enhanced Multi-hop Reading Comprehension Model for Low Data. In: Chen, W., Yao, L., Cai, T., Pan, S., Shen, T., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2022. Lecture Notes in Computer Science, vol 13726. Springer, Cham. https://doi.org/10.1007/978-3-031-22137-8_4