Comparative Quality Analysis of GPT-Based Multiple Choice Question Generation

  • Conference paper
  • Applied Informatics (ICAI 2023)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1874))


Abstract

Assessment is an essential part of education, both for teachers who assess their students and for learners who evaluate themselves. Multiple-choice questions (MCQs) are a popular assessment format, as they can be graded automatically and can cover a wide range of learning items. However, creating high-quality MCQ items is nontrivial. With the advent of the Generative Pre-trained Transformer (GPT), considerable effort has recently been devoted to Automatic Question Generation (AQG). While metrics have been applied to evaluate the linguistic quality of generated questions, an evaluation against the best practices for MCQ creation has been missing so far. In this paper, we propose an analysis of the quality of MCQs automatically generated by 3 different GPT-based services. After producing 150 MCQs in the domain of computer science, we analyse them according to common multiple-choice item-writing guidelines and annotate them with the docimological issues we identified. The dataset of annotated MCQs is available in Moodle XML format. We discuss the different flaws and propose solutions for AQG service developers.
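The abstract notes that the annotated MCQs are distributed in Moodle XML format. As an illustration of what that interchange format looks like, the sketch below serializes a single multiple-choice item using Python's standard library. The question content and the choice of fields are illustrative assumptions, not taken from the paper's dataset.

```python
# Hedged sketch: serializing one MCQ into Moodle XML.
# The question stem and options below are made up for illustration.
import xml.etree.ElementTree as ET

def mcq_to_moodle_xml(stem, options, correct_index):
    """Build a Moodle XML <quiz> containing a single multichoice question."""
    quiz = ET.Element("quiz")
    q = ET.SubElement(quiz, "question", type="multichoice")
    name = ET.SubElement(q, "name")
    ET.SubElement(name, "text").text = stem[:40]          # short item name
    qtext = ET.SubElement(q, "questiontext", format="html")
    ET.SubElement(qtext, "text").text = stem
    ET.SubElement(q, "single").text = "true"              # one correct answer
    ET.SubElement(q, "shuffleanswers").text = "true"
    for i, opt in enumerate(options):
        # The correct option gets full credit; distractors get none.
        fraction = "100" if i == correct_index else "0"
        ans = ET.SubElement(q, "answer", fraction=fraction, format="html")
        ET.SubElement(ans, "text").text = opt
    return ET.tostring(quiz, encoding="unicode")

xml_str = mcq_to_moodle_xml(
    "Which data structure offers O(1) average-case lookup by key?",
    ["Hash table", "Linked list", "Binary search tree", "Stack"],
    0,
)
```

A file built this way can be imported through Moodle's question-bank import dialog; a full export would simply contain one `<question>` element per MCQ under the same `<quiz>` root.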


Notes

  1. https://www.wooclap.com
  2. https://moodle.org
  3. https://doi.org/10.6084/m9.figshare.23689044
  4. https://developer.apple.com/documentation/coremotion/cmmotionmanager

Author information

Correspondence to Christian Grévisse.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Grévisse, C. (2024). Comparative Quality Analysis of GPT-Based Multiple Choice Question Generation. In: Florez, H., Leon, M. (eds) Applied Informatics. ICAI 2023. Communications in Computer and Information Science, vol 1874. Springer, Cham. https://doi.org/10.1007/978-3-031-46813-1_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46812-4

  • Online ISBN: 978-3-031-46813-1

  • eBook Packages: Computer Science, Computer Science (R0)
