Abstract
In the educational domain, identifying similarity among test items offers clear benefits for exam quality management and personalized student learning. Existing studies have mostly relied on student performance data, such as counts of correct and incorrect answers, to measure item similarity. However, the nuanced semantic information within test items has been overlooked, possibly due to a lack of similarity-labeled data: human annotation of educational data demands costly expertise, and items comprising multiple aspects, such as questions and answer choices, require detailed annotation criteria. In this paper, we introduce the task of aspect-based semantic textual similarity for educational test items (aSTS-EI), in which similarity is assessed with respect to specific aspects of test items, and we present an LLM-guided benchmark dataset. We report baseline performance by extending existing semantic textual similarity (STS) methods, setting the groundwork for future work on aSTS-EI. In addition, to assist data-scarce settings, we propose a progressive augmentation (ProAug) method that generates item aspects step by step via recursive prompting. Experimental results show that existing STS methods are effective for shorter aspects while underlining the need for specialized approaches on relatively longer aspects. Moreover, the markedly improved results with ProAug highlight the value of our augmentation strategy in overcoming data scarcity.
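The abstract's description of ProAug suggests a simple recursive-prompting loop over item aspects. The following is a minimal, hypothetical sketch, not the authors' implementation: the `ask_llm` helper, the `ASPECT_PROMPTS` templates, and the passage/question/choices aspect names are all illustrative assumptions, since the paper's actual prompts and LLM interface are not reproduced here. Each aspect is generated in turn, with every prompt conditioned on the aspects produced so far.

```python
# Hypothetical sketch of progressive augmentation via recursive prompting.
# Each aspect of a test item is generated step by step, and every prompt
# is conditioned on the aspects already produced.

from typing import Dict, List


def ask_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to a chat LLM and return its text reply."""
    raise NotImplementedError("Wire this to an LLM provider of choice.")


# Illustrative prompt templates, one per aspect (assumed, not from the paper).
ASPECT_PROMPTS: Dict[str, str] = {
    "passage": "Write a short reading passage for an English test item.",
    "question": ("Given the item so far, write one comprehension question."
                 "\n\n{context}"),
    "choices": ("Given the item so far, write four answer choices, exactly "
                "one of them correct.\n\n{context}"),
}


def generate_item(aspect_order: List[str]) -> Dict[str, str]:
    """Generate one synthetic test item, one aspect at a time."""
    item: Dict[str, str] = {}
    for aspect in aspect_order:
        # Recursive prompting: feed previously generated aspects back in.
        context = "\n\n".join(f"{k.upper()}: {v}" for k, v in item.items())
        prompt = ASPECT_PROMPTS[aspect].format(context=context)
        item[aspect] = ask_llm(prompt)
    return item


# Example usage: build an item as passage -> question -> choices.
# item = generate_item(["passage", "question", "choices"])
```

Under this reading, the generated items would then be paired and similarity-labeled per aspect, which is where the LLM-guided benchmark construction described in the abstract would come in.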
Notes
- Two proficient English teachers with 100% job success were hired via Upwork, https://www.upwork.com.
Acknowledgments
This work was partly supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (No. 2022-0-00223, Development of digital therapeutics to improve communication ability of autism spectrum disorder patients; No. 2019-0-01906, Artificial Intelligence Graduate School Program (POSTECH)) and by the Smart HealthCare Program funded by the Korean National Police Agency (KNPA) (No. 220222M01).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Do, H., Lee, G.G. (2024). Aspect-Based Semantic Textual Similarity for Educational Test Items. In: Olney, A.M., Chounta, I.-A., Liu, Z., Santos, O.C., Bittencourt, I.I. (eds) Artificial Intelligence in Education. AIED 2024. Lecture Notes in Computer Science, vol 14830. Springer, Cham. https://doi.org/10.1007/978-3-031-64299-9_30
DOI: https://doi.org/10.1007/978-3-031-64299-9_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-64298-2
Online ISBN: 978-3-031-64299-9
eBook Packages: Computer Science (R0)