Use of Machine Learning Methods in the Assessment of Programming Assignments

Tarcsay, Botond; Vasić, Jelena; Perez-Tellez, Fernando

doi:10.1007/978-3-031-16270-1_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13502)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

Programming has become an important skill in today’s world and is taught widely in both traditional and online settings. Educators need to grade increasing numbers of student submissions. Unit testing can contribute to the automation of the grading process; however, it cannot assess the structure or partial correctness of code, both of which are needed for finely differentiated grading. This paper builds on previous research that investigated several machine learning models for determining the correctness of source code. It was found that some such models can be successful, provided that the code samples used for fitting and prediction fulfil the same sets of requirements (corresponding to coding assignments). The hypothesis investigated in this paper is that code samples can be grouped by similarity of the requirements that they fulfil, and that models built with samples of code from such a group can be used for determining the quality of new samples that belong to the same group, even if they do not correspond to the same coding assignment. This would make for a much more useful predictive model in practice. The investigation involved ten different machine learning algorithms used on over four hundred thousand student code submissions, and it confirmed the hypothesis.
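The evaluation idea in the abstract (fit on some assignments of a requirement group, predict on a different assignment of the same group) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the bag-of-tokens features, the nearest-centroid classifier, the `pass`/`fail` labels, the assignment names, and the toy submissions are all hypothetical, and none of them is taken from the paper, whose ten algorithms and data set are not described in the abstract.

```python
import re
from collections import Counter


def tokenize(code):
    """Split source text into identifiers, integers, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def centroid(samples):
    """Average the token counts of several code samples."""
    total = Counter()
    for code in samples:
        total.update(tokenize(code))
    return {tok: n / len(samples) for tok, n in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(v * b.get(tok, 0.0) for tok, v in a.items())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit(samples):
    """Build one centroid per label from (code, label) pairs."""
    by_label = {}
    for code, label in samples:
        by_label.setdefault(label, []).append(code)
    return {label: centroid(codes) for label, codes in by_label.items()}


def predict(model, code):
    """Return the label whose centroid is most similar to the sample."""
    vec = dict(Counter(tokenize(code)))
    return max(model, key=lambda label: cosine(vec, model[label]))


def leave_one_assignment_out(group):
    """Train on all assignments but one; score on the held-out assignment."""
    scores = {}
    for held_out, test in group.items():
        train = [s for name, ss in group.items() if name != held_out for s in ss]
        model = fit(train)
        hits = sum(predict(model, code) == label for code, label in test)
        scores[held_out] = hits / len(test)
    return scores


# Hypothetical group of two assignments with similar requirements
# (both iterate over a list and print an accumulated value).
group = {
    "sum_list": [
        ("total = 0\nfor x in xs:\n    total += x\nprint(total)", "pass"),
        ("total = 0\nprint(total)", "fail"),
    ],
    "count_list": [
        ("n = 0\nfor x in xs:\n    n += 1\nprint(n)", "pass"),
        ("n = 0\nprint(n)", "fail"),
    ],
}

scores = leave_one_assignment_out(group)
print(scores)
```

On this toy data each held-out assignment is classified correctly by a model that never saw it, because the shared loop structure outweighs the differing variable names. That only demonstrates the evaluation protocol; it says nothing about accuracy on real submissions.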

Author information

Authors and Affiliations

Technological University Dublin, Dublin, Ireland
Botond Tarcsay, Jelena Vasić & Fernando Perez-Tellez

Corresponding author

Correspondence to Botond Tarcsay.

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Aleš Horák
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Karel Pala

About this paper

Cite this paper

Tarcsay, B., Vasić, J., Perez-Tellez, F. (2022). Use of Machine Learning Methods in the Assessment of Programming Assignments. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2022. Lecture Notes in Computer Science, vol 13502. Springer, Cham. https://doi.org/10.1007/978-3-031-16270-1_13

DOI: https://doi.org/10.1007/978-3-031-16270-1_13
Published: 16 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16269-5
Online ISBN: 978-3-031-16270-1
eBook Packages: Computer Science, Computer Science (R0)
