ABSTRACT
Most current automated evaluation tools grade a program solely by functionally testing its outputs. This approach suffers from both false positives (reporting errors where there are none) and false negatives (missing actual errors). In this paper, we present a novel system that emulates manual evaluation of programming assignments by examining the structure, rather than the functional output, of a program, using structural similarity between the given program and a reference solution. We propose an evaluation rubric for scoring structural similarity with respect to a reference solution, and an ML-based approach to map the system-predicted scores to the scores computed using the rubric. We evaluate the system empirically on a corpus of Python programs extracted from the popular programming platform HackerRank, combined with programming assignments submitted by students in an undergraduate Python programming course. The preliminary results are encouraging: the reported error is as low as 12 percent, with a deviation of about 3 percent, showing that the automatically generated scores correlate strongly with the instructor-assigned scores.
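The paper's rubric and similarity metric are not reproduced in this abstract; as a minimal illustrative sketch of the underlying idea, structural similarity between a submission and a reference solution can be approximated by comparing the programs' abstract syntax trees rather than their outputs. The sketch below uses only Python's standard-library `ast` module, and the function names, the normalisation, and the example programs are all hypothetical, not the authors' actual metric:

```python
import ast
from collections import Counter

def ast_node_counts(source: str) -> Counter:
    """Count occurrences of each AST node type in a Python program."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))

def structural_similarity(submission: str, reference: str) -> float:
    """Crude structural score in [0, 1]: overlap of the two programs'
    AST node-type multisets (1.0 means identical node-type profiles)."""
    a = ast_node_counts(submission)
    b = ast_node_counts(reference)
    overlap = sum((a & b).values())              # multiset intersection size
    total = max(sum(a.values()), sum(b.values()))
    return overlap / total if total else 1.0

# Two functionally equivalent programs with different structure:
reference = "def f(n):\n    return sum(i * i for i in range(n))"
loop_version = "def f(n):\n    s = 0\n    for i in range(n):\n        s += i * i\n    return s"

print(structural_similarity(reference, reference))      # identical structure -> 1.0
print(structural_similarity(loop_version, reference))   # related but distinct structure
```

A full system along the lines the abstract describes would replace this node-count heuristic with a rubric-driven comparison and then fit a regression model mapping such similarity scores to instructor-assigned scores.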