Abstract
In introductory programming courses, automated repair tools (ARTs) are used to provide feedback to students who struggle to debug their code. Most successful ARTs take advantage of context-specific educational data to construct repairs of students’ buggy programs. Recent work on student program repair using large language models (LLMs) has also started to utilize such data. An underexplored area in this field is the combination of ARTs with LLMs. In this paper, we propose transferring the repair capabilities of existing ARTs to open large language models by finetuning LLMs on ART corrections of buggy programs. We experiment with this approach using three large datasets of Python programs written by novices. Our results suggest that a finetuned LLM provides more reliable and higher-quality repairs than the repair tool used for finetuning. This opens avenues for further deploying and using educational LLM-based repair techniques.
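The core idea of the abstract — using an ART's output as supervision for finetuning — can be sketched as a data-preparation step. The helper names below (`art_repair`, `build_finetuning_pairs`) and the toy single-bug "repair" are illustrative assumptions, not the paper's actual pipeline or prompt format:

```python
# Sketch: turning ART corrections of buggy student programs into
# (input, target) pairs for seq2seq LLM finetuning.
# `art_repair` is a hypothetical stand-in for a real automated repair tool;
# here it only fixes one known typo for illustration.

def art_repair(buggy_code: str) -> str:
    """Stand-in for an automated repair tool."""
    return buggy_code.replace("retrun", "return")

def build_finetuning_pairs(buggy_programs):
    """Pair each buggy submission with the ART's repair.

    Only programs the ART actually changed are kept, since unchanged
    programs yield no repair signal for the model to learn from.
    """
    pairs = []
    for code in buggy_programs:
        repaired = art_repair(code)
        if repaired != code:
            pairs.append({"input": code, "target": repaired})
    return pairs

examples = build_finetuning_pairs(["def f(x):\n    retrun x + 1\n"])
```

The resulting `{"input": ..., "target": ...}` records could then feed any standard encoder-decoder finetuning loop (e.g. a CodeT5-style model, which the references mention).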
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Koutcheme, C. (2023). Training Language Models for Programming Feedback Using Automated Repair Tools. In: Wang, N., Rebolledo-Mendez, G., Matsuda, N., Santos, O.C., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2023. Lecture Notes in Computer Science, vol 13916. Springer, Cham. https://doi.org/10.1007/978-3-031-36272-9_79
Print ISBN: 978-3-031-36271-2
Online ISBN: 978-3-031-36272-9
eBook Packages: Computer Science, Computer Science (R0)