Abstract
Automatic item generation produces a diverse array of questions from question templates and randomly selected parameters. Such generators are most useful when the generated item instances are of equivalent, or at least predictable, difficulty. In this study, we analyzed student performance on over 300 item generators from four university-level STEM courses, collected over a period of two years. In most cases, we found that the choice of parameters does not significantly affect problem difficulty.
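As an illustration (not drawn from the paper), a minimal item generator pairs one question template with randomly selected parameters: the discrete parameter below plays the role of a problem configuration, while the continuous parameters vary the numbers. All names and values here are hypothetical; this is a sketch of the general technique, not of the generators studied.

```python
import random

def generate_projectile_item(seed=None):
    """Minimal sketch of an automatic item generator (hypothetical example,
    not taken from the paper): one question template, randomly selected parameters."""
    rng = random.Random(seed)

    # Discrete parameter: a small set of problem configurations.
    direction = rng.choice(["upward", "downward"])
    # Continuous parameters: drawn from wide numeric ranges.
    speed = round(rng.uniform(5.0, 25.0), 1)   # initial speed, m/s
    height = round(rng.uniform(2.0, 40.0), 1)  # ledge height, m
    g = 9.8                                    # m/s^2

    # The same template yields many distinct item instances.
    stem = (f"A ball is thrown {direction} at {speed} m/s from a ledge "
            f"{height} m above the ground. What is its speed (in m/s) "
            f"when it hits the ground?")
    # By energy conservation the answer is independent of the throw direction,
    # so both configurations should, in principle, be equally difficult.
    answer = round((speed ** 2 + 2 * g * height) ** 0.5, 2)

    return {"stem": stem,
            "params": {"direction": direction, "speed": speed, "height": height},
            "answer": answer}

if __name__ == "__main__":
    print(generate_projectile_item(seed=7))
```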
In our analysis, we found it useful to distinguish parameters drawn from a small number (\({<}10\)) of values from those drawn from a large, often continuous, range of values. We observed that values from smaller ranges were more likely to significantly affect difficulty, because they sometimes represented qualitatively different configurations of the problem (e.g., upward force vs. downward force). Through manual review of the problems with significant difficulty variance, we found that the source of the variance was generally easy to understand once the data were presented. These results suggest that the use of automatic item generation by college faculty is warranted: most problems do not exhibit significant difficulty variation, and the few that do can be detected through automatic means and addressed by the faculty member.
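To make the detection idea concrete, the sketch below flags a discrete parameter whose values are associated with different correctness rates, using a chi-square test of independence. This is an illustrative analysis on assumed data, not necessarily the statistical procedure used in the paper; with many item generators, a multiple-comparisons correction (e.g., false discovery rate control) would also be applied.

```python
from collections import defaultdict
from scipy.stats import chi2_contingency

def flag_difficulty_variation(attempts, alpha=0.05):
    """Illustrative sketch, not the paper's exact analysis: test whether a
    discrete generator parameter is associated with student correctness.

    `attempts` is an iterable of (param_value, correct) pairs, one per
    student attempt, with `correct` a boolean.
    """
    # Contingency table: one row per parameter value, columns = correct / incorrect.
    counts = defaultdict(lambda: [0, 0])
    for value, correct in attempts:
        counts[value][0 if correct else 1] += 1
    table = [counts[v] for v in sorted(counts)]

    # Chi-square test of independence: a small p-value suggests that some
    # parameter values make the item harder or easier than others.
    chi2, p, dof, _expected = chi2_contingency(table)
    return {"chi2": chi2, "p": p, "dof": dof, "flagged": p < alpha}

# Hypothetical data: two configurations with similar correctness rates.
data = ([("upward", True)] * 40 + [("upward", False)] * 10 +
        [("downward", True)] * 37 + [("downward", False)] * 13)
print(flag_difficulty_variation(data))
```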
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, B., Zilles, C., West, M., Bretl, T. (2019). Effect of Discrete and Continuous Parameter Variation on Difficulty in Automatic Item Generation. In: Isotani, S., Millán, E., Ogan, A., Hastings, P., McLaren, B., Luckin, R. (eds.) Artificial Intelligence in Education. AIED 2019. Lecture Notes in Computer Science, vol. 11625. Springer, Cham. https://doi.org/10.1007/978-3-030-23204-7_7
DOI: https://doi.org/10.1007/978-3-030-23204-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23203-0
Online ISBN: 978-3-030-23204-7