
Superimposed Attention Mechanism-Based CNN Network for Reading Comprehension and Question Answering

  • Conference paper

Data Science (ICPCSEE 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1059)


Abstract

In recent years, end-to-end models have been widely used in the fields of machine comprehension (MC) and question answering (QA). These models typically combine a recurrent neural network (RNN) or a convolutional neural network (CNN) with an attention mechanism to improve accuracy. However, a single attention mechanism does not fully capture the meaning of the text. In this paper, the recurrent neural network is replaced with a convolutional neural network for text processing, and a superimposed attention mechanism is proposed. The model, built by combining a convolutional neural network with the superimposed attention mechanism, achieves good results on the Stanford Question Answering Dataset (SQuAD).
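The full text is not shown on this page, so the details of the superimposed attention mechanism are not available here. Purely as an illustrative sketch of the abstract's idea, the PyTorch snippet below shows one plausible reading: a 1-D convolutional encoder standing in for an RNN, followed by two stacked attention passes whose outputs are superimposed (summed). Every class name, layer size, and the combination rule are assumptions of this sketch, not the authors' published design.

```python
# Hypothetical sketch (not the authors' code): a 1-D convolutional text
# encoder followed by two stacked ("superimposed") attention layers whose
# outputs are summed. All sizes and names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEncoder(nn.Module):
    """Replaces an RNN with a depth-2 stack of 1-D convolutions."""
    def __init__(self, dim: int, kernel: int = 3):
        super().__init__()
        pad = kernel // 2  # 'same' padding keeps the sequence length fixed
        self.convs = nn.ModuleList(
            [nn.Conv1d(dim, dim, kernel, padding=pad) for _ in range(2)]
        )

    def forward(self, x):            # x: (batch, seq_len, dim)
        h = x.transpose(1, 2)        # Conv1d expects (batch, dim, seq_len)
        for conv in self.convs:
            h = F.relu(conv(h)) + h  # residual connection
        return h.transpose(1, 2)

def attention(query, key, value):
    """Scaled dot-product attention of the query positions over the keys."""
    scores = query @ key.transpose(1, 2) / key.size(-1) ** 0.5
    return torch.softmax(scores, dim=-1) @ value

class SuperimposedAttention(nn.Module):
    """Two attention passes whose outputs are summed -- one reading of
    'superimposed'; the paper's actual combination rule may differ."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj1 = nn.Linear(dim, dim)
        self.proj2 = nn.Linear(dim, dim)

    def forward(self, context, question):
        a1 = attention(self.proj1(context), question, question)
        a2 = attention(self.proj2(a1), question, question)
        return a1 + a2               # superimpose the two attention outputs

# Toy usage with random embeddings (batch=2, context len=30, question len=10)
dim = 64
enc = ConvEncoder(dim)
att = SuperimposedAttention(dim)
c = enc(torch.randn(2, 30, dim))
q = enc(torch.randn(2, 10, dim))
print(att(c, q).shape)               # torch.Size([2, 30, 64])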



Acknowledgement

This paper is sponsored by the National Science Foundation of China (61772075), the National Science Foundation of Hebei Province (F2017208012), and the Humanities and Social Sciences Research Projects of the Ministry of Education (17JDGC022).

Author information

Corresponding author

Correspondence to Kai Gao.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Li, M., Hou, X., Li, J., Gao, K. (2019). Superimposed Attention Mechanism-Based CNN Network for Reading Comprehension and Question Answering. In: Mao, R., Wang, H., Xie, X., Lu, Z. (eds) Data Science. ICPCSEE 2019. Communications in Computer and Information Science, vol 1059. Springer, Singapore. https://doi.org/10.1007/978-981-15-0121-0_2


  • DOI: https://doi.org/10.1007/978-981-15-0121-0_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0120-3

  • Online ISBN: 978-981-15-0121-0

  • eBook Packages: Computer Science, Computer Science (R0)
