Research article · DOI: 10.1145/3341981.3344227 · ICTIR Conference Proceedings

Unsupervised Story Comprehension with Hierarchical Encoder-Decoder

Published: 26 September 2019

ABSTRACT

Commonsense understanding is a long-standing, still unresolved goal of natural language processing. One standard testbed for commonsense understanding is the Story Cloze Test (SCT) [22]. In SCT, given a four-sentence story, a system must select the proper ending from two proposed candidates. Because the SCT training set contains only unlabeled stories, previous work usually trains on the small labeled development set, which ignores the abundant unlabeled training data and, essentially, does not model the commonsense reasoning procedure. In this paper, we propose an unsupervised sequence-to-sequence method for story reading comprehension: we use only the unlabeled stories and directly model the context-to-ending inference probability. We further propose a loss-reweighting strategy that dynamically tunes the seq-to-seq training process. Experimental results demonstrate the advantage of the proposed model, which achieves results comparable to supervised methods on SCT.
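The abstract mentions a loss-reweighting strategy that "dynamically tunes" seq-to-seq training but gives no formula here. The sketch below is one plausible variant, not the paper's actual scheme: each story's loss is weighted by its normalized magnitude, so harder-to-generate endings contribute more to the batch loss. The function name and the specific weighting rule are our assumptions for illustration.

```python
# Hypothetical sketch of dynamic loss reweighting for a batch of
# per-example seq-to-seq losses (e.g., per-story cross-entropy).
# Weights are proportional to each example's loss, renormalized so
# that they sum to the batch size; equal losses recover the plain mean.

def reweight_losses(per_example_losses):
    """Return the reweighted mean loss for one batch."""
    n = len(per_example_losses)
    total = sum(per_example_losses)
    if total == 0:
        return 0.0
    # Weight each example by its share of the batch loss.
    weights = [n * loss / total for loss in per_example_losses]
    return sum(w * loss for w, loss in zip(weights, per_example_losses)) / n

# Example: one story ending is much harder than the others, so it
# dominates the reweighted loss more than it would the plain mean.
print(reweight_losses([0.5, 0.5, 3.0]))  # → 2.375 (plain mean would be ~1.33)
```

With equal per-example losses the weights are all 1 and the result equals the ordinary mean, so the reweighting only changes training when the batch is unbalanced.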

References

  1. Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet Project. In COLING-ACL.
  2. Zheng Cai, Lifu Tu, and Kevin Gimpel. 2017. Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task. In ACL.
  3. Nathanael Chambers and Daniel Jurafsky. 2008. Unsupervised Learning of Narrative Event Chains. In ACL. 789--797.
  4. Nathanael Chambers and Dan Jurafsky. 2009. Unsupervised Learning of Narrative Schemas and Their Participants. In ACL. 602--610.
  5. Ming-Wei Chang, Kristina Toutanova, Kenton Lee, and Jacob Devlin. 2019. Language Model Pre-training for Hierarchical Document Representations. arXiv preprint arXiv:1901.09128 (2019).
  6. Snigdha Chaturvedi, Haoruo Peng, and Dan Roth. 2017. Story Comprehension for Predicting What Happens Next. In EMNLP.
  7. Danqi Chen, Jason Bolton, and Christopher D. Manning. 2016. A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task. In ACL.
  8. Qian Chen, Xiaodan Zhu, Zhen-Hua Ling, Si Wei, Hui Jiang, and Diana Inkpen. 2017. Enhanced LSTM for Natural Language Inference. In ACL.
  9. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805 (2018).
  10. Yarin Gal and Zoubin Ghahramani. 2016. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. In NIPS.
  11. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In NIPS.
  12. Edouard Grave, Tomas Mikolov, Armand Joulin, and Piotr Bojanowski. 2017. Bag of Tricks for Efficient Text Classification. In EACL.
  13. Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. 2015. Teaching Machines to Read and Comprehend. In NIPS. 1684--1692.
  14. Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning Deep Structured Semantic Models for Web Search Using Clickthrough Data. In CIKM. 2333--2338.
  15. Rafal Józefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. 2016. Exploring the Limits of Language Modeling. CoRR abs/1602.02410 (2016).
  16. Ryan Kiros, Yukun Zhu, Ruslan R. Salakhutdinov, Richard Zemel, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Skip-Thought Vectors. In NIPS. 3294--3302.
  17. Hongyu Lin, Le Sun, and Xianpei Han. 2017. Reasoning with Heterogeneous Knowledge for Commonsense Machine Comprehension. In EMNLP. 2022--2033. http://aclweb.org/anthology/D17-1215
  18. Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A Structured Self-Attentive Sentence Embedding. arXiv preprint arXiv:1703.03130 (2017).
  19. Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. In EMNLP.
  20. Todor Mihaylov and Anette Frank. 2017. Story Cloze Ending Selection Baselines and Data Examination. arXiv preprint arXiv:1703.04330 (2017).
  21. Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed Representations of Words and Phrases and Their Compositionality. In NIPS. 3111--3119.
  22. Nasrin Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, and James Allen. 2016. A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories. In NAACL-HLT.
  23. Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. arXiv preprint arXiv:1611.09268 (2016).
  24. Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In EMNLP.
  25. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep Contextualized Word Representations. arXiv preprint arXiv:1802.05365 (2018).
  26. Karl Pichotta and Raymond J. Mooney. 2016. Learning Statistical Scripts with LSTM Recurrent Neural Networks. In AAAI. 2800--2806.
  27. Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-Training. OpenAI technical report (2018).
  28. Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In EMNLP.
  29. Matthew Richardson, Christopher J. C. Burges, and Erin Renshaw. 2013. MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text. In EMNLP.
  30. Rachel Rudinger, Pushpendre Rastogi, Francis Ferraro, and Benjamin Van Durme. 2015. Script Induction as Language Modeling. In EMNLP. 1681--1686.
  31. Roy Schwartz, Maarten Sap, Ioannis Konstas, Leila Zilles, Yejin Choi, and Noah A. Smith. 2017. The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task. arXiv preprint arXiv:1702.01841 (2017).
  32. Iulian V. Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau. 2016. Building End-to-End Dialogue Systems Using Generative Hierarchical Neural Network Models. In AAAI.
  33. Siddarth Srinivasan, Richa Arora, and Mark O. Riedl. 2018. A Simple and Effective Approach to the Story Cloze Test. In NAACL-HLT.
  34. Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. JMLR (2014).
  35. Saku Sugawara, Kentaro Inui, Satoshi Sekine, and Akiko Aizawa. 2018. What Makes Reading Comprehension Questions Easier? In EMNLP.
  36. Saku Sugawara, Yusuke Kido, Hikaru Yokono, and Akiko Aizawa. 2017. Evaluation Metrics for Machine Reading Comprehension: Prerequisite Skills and Readability. In ACL.
  37. Simon Suster and Walter Daelemans. 2018. CliCR: A Dataset of Clinical Case Reports for Machine Reading Comprehension. CoRR abs/1803.09720 (2018).
  38. Adam Trischler, Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, and Kaheer Suleman. 2016. NewsQA: A Machine Comprehension Dataset. arXiv preprint arXiv:1611.09830 (2016).
  39. Bingning Wang, Kang Liu, and Jun Zhao. 2017. Conditional Generative Adversarial Networks for Commonsense Machine Comprehension. In IJCAI.
  40. Matthew D. Zeiler. 2012. ADADELTA: An Adaptive Learning Rate Method. arXiv preprint (2012).
  41. Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books. In ICCV.

Published in

ICTIR '19: Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval
September 2019, 273 pages
ISBN: 9781450368810
DOI: 10.1145/3341981
Copyright © 2019 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


      Acceptance Rates

ICTIR '19 paper acceptance rate: 20 of 41 submissions (49%). Overall acceptance rate: 209 of 482 submissions (43%).
