Abstract
Assessment is an essential part of education, both for teachers who assess their students and for learners who evaluate themselves. Multiple-choice questions (MCQs) are a popular type of assessment question, as they can be graded automatically and can cover a wide range of learning items. However, creating high-quality MCQ items is nontrivial. With the advent of the Generative Pre-trained Transformer (GPT), considerable effort has recently been devoted to Automatic Question Generation (AQG). While metrics have been applied to evaluate linguistic quality, an evaluation of generated questions against best practices for MCQ creation has so far been missing. In this paper, we analyse the quality of MCQs automatically generated by three different GPT-based services. After producing 150 MCQs in the domain of computer science, we analyse them according to common multiple-choice item-writing guidelines and annotate them with the docimological issues identified. The dataset of annotated MCQs is available in Moodle XML format. We discuss the different flaws and propose solutions for AQG service developers.
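For readers unfamiliar with Moodle XML, the following is a minimal sketch of how one annotated MCQ might be serialised in that format. It is illustrative only and not taken from the published dataset: the function name `mcq_to_moodle_xml`, the sample question, and the choice to store an issue annotation as a Moodle tag are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def mcq_to_moodle_xml(name, stem, options, issues=()):
    """Build a Moodle XML <quiz> holding one single-answer multichoice question.

    options: list of (text, is_correct) pairs.
    issues:  hypothetical annotation labels, stored here as Moodle tags
             (an assumption; the dataset may encode annotations differently).
    """
    quiz = ET.Element("quiz")
    question = ET.SubElement(quiz, "question", type="multichoice")

    # Question name and stem, each wrapped in a <text> element.
    ET.SubElement(ET.SubElement(question, "name"), "text").text = name
    qtext = ET.SubElement(question, "questiontext", format="html")
    ET.SubElement(qtext, "text").text = stem

    # Exactly one correct answer; let Moodle shuffle the options.
    ET.SubElement(question, "single").text = "true"
    ET.SubElement(question, "shuffleanswers").text = "1"

    # fraction="100" marks the key, fraction="0" marks a distractor.
    for text, is_correct in options:
        fraction = "100" if is_correct else "0"
        answer = ET.SubElement(question, "answer", fraction=fraction)
        ET.SubElement(answer, "text").text = text

    if issues:
        tags = ET.SubElement(question, "tags")
        for issue in issues:
            ET.SubElement(ET.SubElement(tags, "tag"), "text").text = issue

    return ET.tostring(quiz, encoding="unicode")


xml_out = mcq_to_moodle_xml(
    name="Stack order",
    stem="Which data structure follows a last-in, first-out policy?",
    options=[("Stack", True), ("Queue", False), ("None of the above", False)],
    issues=["none-of-the-above option"],  # hypothetical issue label
)
print(xml_out)
```

A file of this shape can be imported into a Moodle question bank, which is what makes the format convenient for sharing annotated items.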
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Grévisse, C. (2024). Comparative Quality Analysis of GPT-Based Multiple Choice Question Generation. In: Florez, H., Leon, M. (eds) Applied Informatics. ICAI 2023. Communications in Computer and Information Science, vol 1874. Springer, Cham. https://doi.org/10.1007/978-3-031-46813-1_29
DOI: https://doi.org/10.1007/978-3-031-46813-1_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46812-4
Online ISBN: 978-3-031-46813-1
eBook Packages: Computer Science, Computer Science (R0)