WikiFlash: Generating Flashcards from Wikipedia Articles

  • Conference paper
Neural Information Processing (ICONIP 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13111)

Abstract

Flashcards, or question-answer pairs more generally, are a fundamental tool in education. However, creating question-answer pairs is a tedious task that often deters independent learners from properly studying a topic. We provide a tool that automatically generates flashcards from Wikipedia articles, with the goal of making independent education more attractive to a broader audience. We investigate different state-of-the-art natural language processing models and propose a pipeline that generates flashcards at different levels of detail from any given article. We evaluate the proposed pipeline in terms of computing time and the number of questions generated and retained under the proposed filtering method. In a user study, participants rated the generated flashcards as helpful. Further, users judged the quality of openly available, human-created flashcards as comparable to or only slightly better than the automatically generated cards. (Our application is available at: flashcard.ethz.ch.)
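The abstract mentions filtering the generated questions but does not spell out the mechanism here. A common approach for this kind of filtering is roundtrip consistency: a question-answering model re-answers each generated question against the source text, and the card is kept only if the model's answer sufficiently overlaps the intended answer. The sketch below is a hypothetical, minimal illustration of that idea (not the paper's exact method), using token-level F1 overlap; `keep_card`, `token_f1`, and the threshold value are illustrative names and choices, not from the source.

```python
def token_f1(pred: str, gold: str) -> float:
    """Token-level F1 overlap between a predicted and a gold answer,
    as commonly used in extractive QA evaluation."""
    pred_toks = pred.lower().split()
    gold_toks = gold.lower().split()
    if not pred_toks or not gold_toks:
        return 0.0
    # Count overlapping tokens, respecting multiplicity.
    gold_counts: dict[str, int] = {}
    for t in gold_toks:
        gold_counts[t] = gold_counts.get(t, 0) + 1
    common = 0
    for t in pred_toks:
        if gold_counts.get(t, 0) > 0:
            common += 1
            gold_counts[t] -= 1
    if common == 0:
        return 0.0
    precision = common / len(pred_toks)
    recall = common / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


def keep_card(question: str, answer: str, qa_model_answer: str,
              threshold: float = 0.5) -> bool:
    """Roundtrip filter: keep a generated flashcard only if a QA model,
    asked the generated question, produces an answer that overlaps the
    intended answer above the threshold."""
    return token_f1(qa_model_answer, answer) >= threshold
```

In a full pipeline, `qa_model_answer` would come from running a QA model over the article text; here it is passed in directly so the filtering logic stays self-contained.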

Authors in alphabetical order.



Author information

Correspondence to Damián Pascual or Oliver Richter.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Cheng, Y. et al. (2021). WikiFlash: Generating Flashcards from Wikipedia Articles. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol. 13111. Springer, Cham. https://doi.org/10.1007/978-3-030-92273-3_12

  • DOI: https://doi.org/10.1007/978-3-030-92273-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92272-6

  • Online ISBN: 978-3-030-92273-3

  • eBook Packages: Computer Science, Computer Science (R0)
