Fine-Tuning a Large Language Model with Reinforcement Learning for Educational Question Generation

  • Conference paper
Artificial Intelligence in Education (AIED 2024)

Abstract

Educational question generation (EduQG) aims to automatically generate educational questions from textual content, which is crucial for the expansion of online education. Prior research in EduQG has predominantly relied on cross-entropy loss for training, which can lead to issues such as exposure bias and inconsistencies between training and testing metrics. To mitigate these issues, we propose a reinforcement learning (RL) based large language model (LLM) for educational question generation. In particular, we fine-tune the Google FLAN-T5 model using a mixed objective function that combines cross-entropy and RL losses to ensure the generation of questions that are syntactically and semantically accurate. The experimental results on the SciQ question generation dataset show that the proposed method is competitive with current state-of-the-art systems in terms of predictive performance and linguistic quality.
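
The abstract does not spell out the form of the mixed objective, so the following is only a minimal sketch of one common way to combine a cross-entropy loss with an RL loss: a self-critical policy-gradient term with a greedy-decoding baseline, blended with the cross-entropy term by a weight `gamma`. The function names, the choice of a self-critical baseline, the reward values, and `gamma = 0.5` are all illustrative assumptions, not details taken from the paper. Plain Python scalars stand in for model outputs to keep the arithmetic visible.

```python
import math


def cross_entropy_loss(reference_token_probs):
    """Negative log-likelihood of the reference question's tokens.

    Each entry is the probability the model assigns to the correct
    next token under teacher forcing.
    """
    return -sum(math.log(p) for p in reference_token_probs)


def self_critical_loss(sampled_token_probs, sampled_reward, greedy_reward):
    """Policy-gradient loss with a greedy-decoding baseline (assumed form).

    When the sampled question scores higher than the greedy one, the
    coefficient is negative, so minimizing the loss raises the sampled
    sequence's log-probability.
    """
    log_prob = sum(math.log(p) for p in sampled_token_probs)
    return (greedy_reward - sampled_reward) * log_prob


def mixed_loss(ce_loss, rl_loss, gamma):
    """Convex combination of the two losses; gamma is a hypothetical weight."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return gamma * rl_loss + (1.0 - gamma) * ce_loss


# Toy numbers only: probabilities and rewards are made up for illustration.
ce = cross_entropy_loss([0.5, 0.25])
rl = self_critical_loss([0.5, 0.5], sampled_reward=0.8, greedy_reward=0.6)
print(mixed_loss(ce, rl, gamma=0.5))
```

Setting `gamma` to 0 recovers pure cross-entropy training and setting it to 1 recovers pure RL training; in this sketch the reward would come from whichever automatic metric is being optimized.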

Notes

  1. https://pytorch.org/
  2. https://www.pytorchlightning.ai/
  3. https://github.com/huggingface/transformers
  4. https://github.com/hmuus01/Educational_QG/tree/main
  5. https://github.com/hugochan/RL-based-Graph2Seq-for-NQG/tree/master
  6. https://huggingface.co/google/flan-t5-base
  7. https://hpc-docs.uni.lu/systems/iris/

Author information

Corresponding author

Correspondence to Salima Lamsiyah.

A Examples of Generated Questions of Our RLLM-EduQG System

Table 4. Comparison of Questions Generated by Our System, Leaf+, and EduQG+ with Ground Truth Questions Using Randomly Selected Examples from the SciQ Dataset Test Set.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lamsiyah, S., El Mahdaouy, A., Nourbakhsh, A., Schommer, C. (2024). Fine-Tuning a Large Language Model with Reinforcement Learning for Educational Question Generation. In: Olney, A.M., Chounta, IA., Liu, Z., Santos, O.C., Bittencourt, I.I. (eds) Artificial Intelligence in Education. AIED 2024. Lecture Notes in Computer Science, vol 14829. Springer, Cham. https://doi.org/10.1007/978-3-031-64302-6_30

  • DOI: https://doi.org/10.1007/978-3-031-64302-6_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-64301-9

  • Online ISBN: 978-3-031-64302-6

  • eBook Packages: Computer Science, Computer Science (R0)
