GameOfThronesQA: Answer-Aware Question-Answer Pairs for TV Series

  • Conference paper
Advances in Information Retrieval (ECIR 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13186)

Abstract

In this paper, we present a corpus of question-answer pairs about a TV series, generated from paragraph contexts. The dataset, called GameofThronesQA V1.0, contains 5237 unique question-answer pairs covering all eight seasons of the Game of Thrones TV series. In particular, we provide a pipeline approach for answer-aware question generation, in which the answers are extracted based on named entities from the TV series. This differs from traditional methods, which generate questions first and find the relevant answers later. Furthermore, we compare the generated corpus with benchmark datasets such as SQuAD, TriviaQA, WikiQA and TweetQA. A snapshot of the dataset is provided as an appendix for review purposes and will be released to the public later.
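
The answer-first ordering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gazetteer lookup stands in for a real named-entity recognizer, the `<hl>` marker and the `generate question:` prefix are assumed input conventions for a sequence-to-sequence question generator, and the sample sentence is a toy example rather than corpus data. The question generator itself is left out.

```python
import re
from typing import List, Tuple

# Hypothetical gazetteer standing in for a real named-entity recognizer.
ENTITY_GAZETTEER = ["Jon Snow", "Winterfell", "Daenerys Targaryen"]

HIGHLIGHT = "<hl>"  # assumed marker token; not taken from the paper


def extract_answers(paragraph: str, gazetteer: List[str]) -> List[Tuple[str, int]]:
    """Return (entity, start offset) for each gazetteer entry found in the paragraph."""
    found = []
    for entity in gazetteer:
        for match in re.finditer(re.escape(entity), paragraph):
            found.append((entity, match.start()))
    # Order candidates by their position in the paragraph.
    return sorted(found, key=lambda item: item[1])


def build_inputs(paragraph: str, gazetteer: List[str]) -> List[Tuple[str, str]]:
    """Pair each candidate answer with a context string that marks its span."""
    inputs = []
    for answer, start in extract_answers(paragraph, gazetteer):
        end = start + len(answer)
        marked = f"{paragraph[:start]}{HIGHLIGHT} {answer} {HIGHLIGHT}{paragraph[end:]}"
        # A question generator would receive this string and emit a question
        # whose answer is the highlighted span.
        inputs.append((answer, f"generate question: {marked}"))
    return inputs


if __name__ == "__main__":
    text = "Jon Snow was raised at Winterfell."  # toy sentence, not corpus data
    for answer, model_input in build_inputs(text, ENTITY_GAZETTEER):
        print(answer, "->", model_input)
```

Because the answer is fixed before the question exists, each generated question is tied to a known answer span, which is the property that distinguishes this ordering from question-first approaches.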

Acknowledgments

This study was supported in part by the Discovery and CREATE grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Author information

Corresponding author

Correspondence to Qinmin Vivian Hu.

5 Appendix

Some sample QA pairs from our corpus are given below so that users can review them and get an understanding of the generated dataset. All 5237 generated QA pairs are unique, although different questions may share the same answer.

[Figures a and b: sample QA pairs from the corpus; images not reproduced]
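
The uniqueness property stated above (unique question-answer pairs, with answers allowed to repeat across questions) could be checked along the following lines. This is a hedged sketch, not the authors' procedure: the function names are hypothetical, the lowercase-and-strip normalization is an assumption, and the sample list uses placeholder strings rather than corpus entries.

```python
from collections import Counter
from typing import Dict, List, Tuple

QAPair = Tuple[str, str]  # (question, answer)


def duplicate_pairs(pairs: List[QAPair]) -> List[QAPair]:
    """Return QA pairs that occur more than once after light normalization."""
    counts = Counter((q.strip().lower(), a.strip().lower()) for q, a in pairs)
    return [pair for pair, n in counts.items() if n > 1]


def shared_answers(pairs: List[QAPair]) -> Dict[str, List[str]]:
    """Map each answer to its questions, keeping only answers used by several questions."""
    by_answer: Dict[str, List[str]] = {}
    for question, answer in pairs:
        by_answer.setdefault(answer, []).append(question)
    return {a: qs for a, qs in by_answer.items() if len(qs) > 1}


if __name__ == "__main__":
    # Placeholder pairs for illustration only.
    sample = [
        ("question 1", "answer A"),
        ("question 2", "answer A"),
        ("question 3", "answer B"),
    ]
    print("duplicates:", duplicate_pairs(sample))
    print("shared answers:", shared_answers(sample))
```

An empty result from `duplicate_pairs` together with a non-empty result from `shared_answers` corresponds to the situation described: every pair is distinct even though some answers recur.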

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lahiri, A.K., Hu, Q.V. (2022). GameOfThronesQA: Answer-Aware Question-Answer Pairs for TV Series. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13186. Springer, Cham. https://doi.org/10.1007/978-3-030-99739-7_21

  • DOI: https://doi.org/10.1007/978-3-030-99739-7_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-99738-0

  • Online ISBN: 978-3-030-99739-7

  • eBook Packages: Computer Science (R0)