
Superimposed Attention Mechanism-Based CNN Network for Reading Comprehension and Question Answering

  • Conference paper

Data Science (ICPCSEE 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1059)


Abstract

In recent years, end-to-end models have been widely used in the fields of machine comprehension (MC) and question answering (QA). These models typically combine a recurrent neural network (RNN) or a convolutional neural network (CNN) with an attention mechanism to improve accuracy. However, a single attention mechanism does not fully capture the meaning of the text. In this paper, the recurrent neural network is replaced with a convolutional neural network for text processing, and a superimposed attention mechanism is proposed. The model, built by combining a convolutional neural network with the superimposed attention mechanism, achieves good results on the Stanford Question Answering Dataset (SQuAD).
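The full text is not shown on this page, so the details of the superimposed attention mechanism are not available here. Purely as an illustrative sketch of the abstract's idea, the PyTorch snippet below shows one plausible reading: a 1-D convolutional encoder standing in for an RNN, followed by two stacked attention passes whose outputs are superimposed (summed). Every class name, layer size, and the combination rule are assumptions of this sketch, not the authors' published design.

```python
# Hypothetical sketch (not the authors' code): a 1-D convolutional text
# encoder followed by two stacked ("superimposed") attention layers whose
# outputs are summed. All sizes and names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEncoder(nn.Module):
    """Replaces an RNN with a depth-2 stack of 1-D convolutions."""
    def __init__(self, dim: int, kernel: int = 3):
        super().__init__()
        pad = kernel // 2  # 'same' padding keeps the sequence length fixed
        self.convs = nn.ModuleList(
            [nn.Conv1d(dim, dim, kernel, padding=pad) for _ in range(2)]
        )

    def forward(self, x):            # x: (batch, seq_len, dim)
        h = x.transpose(1, 2)        # Conv1d expects (batch, dim, seq_len)
        for conv in self.convs:
            h = F.relu(conv(h)) + h  # residual connection
        return h.transpose(1, 2)

def attention(query, key, value):
    """Scaled dot-product attention of the query positions over the keys."""
    scores = query @ key.transpose(1, 2) / key.size(-1) ** 0.5
    return torch.softmax(scores, dim=-1) @ value

class SuperimposedAttention(nn.Module):
    """Two attention passes whose outputs are summed -- one reading of
    'superimposed'; the paper's actual combination rule may differ."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj1 = nn.Linear(dim, dim)
        self.proj2 = nn.Linear(dim, dim)

    def forward(self, context, question):
        a1 = attention(self.proj1(context), question, question)
        a2 = attention(self.proj2(a1), question, question)
        return a1 + a2               # superimpose the two attention outputs

# Toy usage with random embeddings (batch=2, context len=30, question len=10)
dim = 64
enc = ConvEncoder(dim)
att = SuperimposedAttention(dim)
c = enc(torch.randn(2, 30, dim))
q = enc(torch.randn(2, 10, dim))
print(att(c, q).shape)               # torch.Size([2, 30, 64])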



Acknowledgement

This paper is sponsored by the National Science Foundation of China (61772075), the National Science Foundation of Hebei Province (F2017208012), and the Humanities and Social Sciences Research Projects of the Ministry of Education (17JDGC022).

Author information

Corresponding author

Correspondence to Kai Gao.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Li, M., Hou, X., Li, J., Gao, K. (2019). Superimposed Attention Mechanism-Based CNN Network for Reading Comprehension and Question Answering. In: Mao, R., Wang, H., Xie, X., Lu, Z. (eds) Data Science. ICPCSEE 2019. Communications in Computer and Information Science, vol 1059. Springer, Singapore. https://doi.org/10.1007/978-981-15-0121-0_2


  • DOI: https://doi.org/10.1007/978-981-15-0121-0_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0120-3

  • Online ISBN: 978-981-15-0121-0

  • eBook Packages: Computer Science, Computer Science (R0)
