Abstract
Automatic item generation produces a diverse array of questions from question templates and randomly selected parameters. Such generators are most useful when the generated item instances are of equivalent, or at least predictable, difficulty. In this study, we analyzed student performance on over 300 item generators from four university-level STEM courses, collected over a period of two years. In most cases, we found that the choice of parameters does not significantly affect problem difficulty.
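As an illustration (not drawn from the paper), a minimal item generator pairs one question template with randomly selected parameters: the discrete parameter below plays the role of a problem configuration, while the continuous parameters vary the numbers. All names and values here are hypothetical; this is a sketch of the general technique, not of the generators studied.

```python
import random

def generate_projectile_item(seed=None):
    """Minimal sketch of an automatic item generator (hypothetical example,
    not taken from the paper): one question template, randomly selected parameters."""
    rng = random.Random(seed)

    # Discrete parameter: a small set of problem configurations.
    direction = rng.choice(["upward", "downward"])
    # Continuous parameters: drawn from wide numeric ranges.
    speed = round(rng.uniform(5.0, 25.0), 1)   # initial speed, m/s
    height = round(rng.uniform(2.0, 40.0), 1)  # ledge height, m
    g = 9.8                                    # m/s^2

    # The same template yields many distinct item instances.
    stem = (f"A ball is thrown {direction} at {speed} m/s from a ledge "
            f"{height} m above the ground. What is its speed (in m/s) "
            f"when it hits the ground?")
    # By energy conservation the answer is independent of the throw direction,
    # so both configurations should, in principle, be equally difficult.
    answer = round((speed ** 2 + 2 * g * height) ** 0.5, 2)

    return {"stem": stem,
            "params": {"direction": direction, "speed": speed, "height": height},
            "answer": answer}

if __name__ == "__main__":
    print(generate_projectile_item(seed=7))
```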
In our analysis, we found it useful to distinguish parameters drawn from a small number (\({<}10\)) of values from those drawn from a large, often continuous, range of values. We observed that values from smaller ranges were more likely to significantly affect difficulty, because they sometimes represented qualitatively different configurations of the problem (e.g., upward force vs. downward force). Through manual review of the problems with significant difficulty variance, we found that the source of the variance was generally easy to understand once the data were presented. These results suggest that the use of automatic item generation by college faculty is warranted: most problems do not exhibit significant difficulty variation, and the few that do can be detected through automatic means and addressed by the faculty member.
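To make the detection idea concrete, the sketch below flags a discrete parameter whose values are associated with different correctness rates, using a chi-square test of independence. This is an illustrative analysis on assumed data, not necessarily the statistical procedure used in the paper; with many item generators, a multiple-comparisons correction (e.g., false discovery rate control) would also be applied.

```python
from collections import defaultdict
from scipy.stats import chi2_contingency

def flag_difficulty_variation(attempts, alpha=0.05):
    """Illustrative sketch, not the paper's exact analysis: test whether a
    discrete generator parameter is associated with student correctness.

    `attempts` is an iterable of (param_value, correct) pairs, one per
    student attempt, with `correct` a boolean.
    """
    # Contingency table: one row per parameter value, columns = correct / incorrect.
    counts = defaultdict(lambda: [0, 0])
    for value, correct in attempts:
        counts[value][0 if correct else 1] += 1
    table = [counts[v] for v in sorted(counts)]

    # Chi-square test of independence: a small p-value suggests that some
    # parameter values make the item harder or easier than others.
    chi2, p, dof, _expected = chi2_contingency(table)
    return {"chi2": chi2, "p": p, "dof": dof, "flagged": p < alpha}

# Hypothetical data: two configurations with similar correctness rates.
data = ([("upward", True)] * 40 + [("upward", False)] * 10 +
        [("downward", True)] * 37 + [("downward", False)] * 13)
print(flag_difficulty_variation(data))
```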
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, B., Zilles, C., West, M., Bretl, T. (2019). Effect of Discrete and Continuous Parameter Variation on Difficulty in Automatic Item Generation. In: Isotani, S., Millán, E., Ogan, A., Hastings, P., McLaren, B., Luckin, R. (eds.) Artificial Intelligence in Education. AIED 2019. Lecture Notes in Computer Science, vol. 11625. Springer, Cham. https://doi.org/10.1007/978-3-030-23204-7_7
DOI: https://doi.org/10.1007/978-3-030-23204-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23203-0
Online ISBN: 978-3-030-23204-7