Abstract
Modern machine learning approaches have been shown to be vulnerable to adversarial attacks in many fields. This is a critical weakness, especially for models that are expected to function in an adversarial environment, such as automatic grading models in exams. However, most of these attacks are limited in their success rate, limited in their applicability to diverse scenarios, or require mathematical expertise on the part of the attacker. The question therefore arises to what extent students themselves are capable of fooling state-of-the-art grading models. This work investigates this question for the short answer question format. For this purpose, we tasked students of a university course on educational technologies with probing a state-of-the-art automatic short answer grading model for weaknesses. Of the fourteen active participants, only one reported the model to be sufficiently free of deficits. The students identified the following weaknesses: a disregard for negation, no plagiarism detection, correct answers not being predicted as such, and oversensitivity to small linguistic changes in answers, triggers, and keywords.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Filighera, A., Steuer, T., Rensing, C. (2020). Fooling It - Student Attacks on Automatic Short Answer Grading. In: Alario-Hoyos, C., Rodríguez-Triana, M.J., Scheffel, M., Arnedillo-Sánchez, I., Dennerlein, S.M. (eds) Addressing Global Challenges and Quality Education. EC-TEL 2020. Lecture Notes in Computer Science, vol 12315. Springer, Cham. https://doi.org/10.1007/978-3-030-57717-9_25
DOI: https://doi.org/10.1007/978-3-030-57717-9_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57716-2
Online ISBN: 978-3-030-57717-9
eBook Packages: Computer Science, Computer Science (R0)