Abstract
Flashcards, or question-answer pairs more generally, are a fundamental tool in education. However, creating question-answer pairs is a tedious job that often deters independent learners from studying a topic properly. We seek to provide a tool that automatically generates flashcards from Wikipedia articles, making independent education more attractive to a broader audience. We investigate different state-of-the-art natural language processing models and propose a pipeline that generates flashcards at different levels of detail from any given article. We evaluate the proposed pipeline in terms of its computing time and the number of questions generated and retained by the proposed filtering method. In a user study, we find that the generated flashcards are rated as helpful. Furthermore, users rated the quality of openly available, human-created flashcards as comparable to, or only slightly better than, that of the automatically generated cards. (Our application is available at: flashcard.ethz.ch.)
Authors in alphabetical order.
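To make the described pipeline concrete, the sketch below strings together off-the-shelf summarization, question-generation, and question-answering models in a generate-then-filter loop of the kind the abstract outlines. It is a minimal illustration, not the authors' exact setup: the specific HuggingFace checkpoints (sshleifer/distilbart-cnn-12-6, valhalla/t5-base-e2e-qg, distilbert-base-cased-distilled-squad), the "generate questions:" input prefix, and the answer-confidence threshold are all our own placeholder assumptions.

```python
# A minimal sketch of a generate-then-filter flashcard pipeline, built on
# HuggingFace `transformers` pipelines. Checkpoint names, the input prefix,
# and the confidence threshold are illustrative assumptions, not the paper's
# exact configuration.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
question_gen = pipeline("text2text-generation", model="valhalla/t5-base-e2e-qg")
answerer = pipeline("question-answering",
                    model="distilbert-base-cased-distilled-squad")


def make_flashcards(article: str, summarize: bool = True) -> list[dict]:
    """Turn an article into flashcards: optionally summarize it first (to
    control the level of detail), generate candidate questions, then keep
    only questions whose answer a QA model can recover from the same text
    (a roundtrip-style filter)."""
    context = (summarizer(article, truncation=True)[0]["summary_text"]
               if summarize else article)

    # This end-to-end QG checkpoint expects a "generate questions:" prefix
    # and emits several questions separated by "<sep>".
    raw = question_gen("generate questions: " + context, max_length=128)
    questions = [q.strip()
                 for q in raw[0]["generated_text"].split("<sep>") if q.strip()]

    cards = []
    for question in questions:
        result = answerer(question=question, context=context)
        if result["score"] > 0.5:  # assumed filtering threshold
            cards.append({"question": question, "answer": result["answer"]})
    return cards
```

Running `make_flashcards` on the plain text of a Wikipedia article yields a list of question-answer dictionaries; toggling `summarize` trades coverage for conciseness, which is one simple way to realize the "different levels of detail" mentioned above.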