
S3-NET: SRU-Based Sentence and Self-Matching Networks for Machine Reading Comprehension

Published: 20 February 2020

Abstract

Machine reading comprehension question answering (MRC-QA) is the task of understanding a given passage well enough to find the correct answer span within it. Because a passage consists of several sentences, the input sequence becomes long, which degrades performance. In this article, we propose S3-NET, which adds sentence-level encoding to address this problem. S3-NET is a deep learning model based on the simple recurrent unit (SRU) architecture that solves MRC-QA by applying a matching network to sentence-level encodings. In addition, S3-NET uses a self-matching network to compute attention weights over its own recurrent neural network sequence. We evaluate on the SQuAD dataset for English and the MindsMRC dataset for Korean. On SQuAD, the proposed S3-NET achieves 71.91% exact match and 81.02% F1 as a single model and 74.12% exact match and 82.34% F1 as an ensemble; on MindsMRC, it achieves 69.43% exact match and 81.53% F1 as a single model and 71.28% exact match and 82.77% F1 as an ensemble.
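The self-matching step described in the abstract can be illustrated with a short sketch. This is a minimal, hypothetical example rather than the authors' implementation: it uses additive attention in which every passage position attends over the whole encoded sequence, and a standard GRU stands in for the SRU encoder (PyTorch has no built-in SRU); the class and parameter names are invented for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfMatchingLayer(nn.Module):
    """Additive self-matching attention over a passage encoding (illustrative sketch).

    Each time step attends over the entire sequence of its own RNN outputs; the
    attended context is concatenated with the original state and re-encoded.
    A GRU is used here as a stand-in for the SRU encoder of S3-NET.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.w_p = nn.Linear(hidden_size, hidden_size, bias=False)
        self.w_q = nn.Linear(hidden_size, hidden_size, bias=False)
        self.v = nn.Linear(hidden_size, 1, bias=False)
        self.rnn = nn.GRU(2 * hidden_size, hidden_size, batch_first=True)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, hidden)
        # scores[b, i, j] = v^T tanh(W_p h_i + W_q h_j)
        scores = self.v(torch.tanh(
            self.w_p(h).unsqueeze(2) + self.w_q(h).unsqueeze(1))).squeeze(-1)
        alpha = F.softmax(scores, dim=-1)   # each position attends over the whole sequence
        context = torch.bmm(alpha, h)       # (batch, seq_len, hidden)
        out, _ = self.rnn(torch.cat([h, context], dim=-1))
        return out

# Toy usage: a "passage" of 30 token encodings with hidden size 64.
layer = SelfMatchingLayer(hidden_size=64)
passage = torch.randn(2, 30, 64)
print(layer(passage).shape)  # torch.Size([2, 30, 64])
```

The point of such a layer is that distant parts of a long passage can interact directly through attention rather than only through the recurrent state, which is the difficulty with long inputs that the abstract identifies.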



• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 3 (May 2020), 228 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3378675

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 20 February 2020
        • Accepted: 1 September 2019
        • Revised: 1 July 2019
        • Received: 1 August 2018
Published in TALLIP, Volume 19, Issue 3


        Qualifiers

        • short-paper
        • Research
        • Refereed
