Identifying typical approaches and errors in Prolog programming with argument-based machine learning
Introduction
Programming is nowadays a must-have skill for professionals in many disciplines, and is becoming part of basic education in many countries. It is taught at many universities and in massive open online courses (MOOCs). Learning to program requires a great deal of practice. Bloom (1984) showed that students learn much faster when a tutor is available to help select exercises, explain errors, and suggest possible problem-solving steps. However, due to high student-teacher ratios at universities – and even more so in MOOCs – it is practically impossible for human tutors to evaluate each student's work individually.
One solution is to use an intelligent tutoring system, where students implement programs in a controlled environment and receive help whenever needed. Such systems can be, in some cases, as effective as a human tutor (VanLehn, 2011). The alternative is to let students use their preferred environment, solve the problem by themselves, submit solutions and then receive automatically generated feedback. Both approaches, however, rely on a difficult knowledge acquisition task: we need to encode the teacher’s knowledge in an explicit form.
To address this problem we propose a machine learning algorithm for learning rules to distinguish between correct and incorrect programs. These rules describe different approaches that students used to solve exercises. A rule describing an incorrect program contains explicit reasons why this program is incorrect. Conversely, a rule describing correct programs specifies necessary components for a program to be correct. The semantics of these rules is analogous to constraint-based tutors (Mitrovic, 2012, Ohlsson, 1992), where constraints specify the necessary properties of correct submissions.
A common problem when using machine learning to build expert systems is that the resulting model is too complex and does not mimic the expert’s cognitive processes. It is quite possible to have a system that has high classification accuracy but is poor at teaching or even explaining its reasoning because a lot of knowledge remains implicit (see Langley & Simon, 1995 or Voosen, 2017).
We encountered a similar problem. To learn from student-submitted programs, which are labeled as “correct” or “incorrect” according to a set of test cases, we decided to use patterns from the programs’ abstract syntax trees (ASTs) as attributes for machine learning. However, the space of all possible attributes contains many meaningless AST patterns, which cannot be used for explanation. We therefore needed a programming teacher to specify which AST patterns are useful for distinguishing between correct and incorrect programs. However, the programming teacher was unable to fully express this knowledge in advance.
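As a minimal illustration of this attribute space (assuming a toy tuple-based AST rather than the paper's actual Prolog parser, which is not reproduced here), patterns can be extracted as small connected fragments of the tree and used as boolean attributes:

```python
# Hedged sketch: a toy AST for the head of the Prolog clause
# "conc([], L, L)." represented as nested (label, children...) tuples.
# The real system extracts richer patterns from parsed student programs.

def subpatterns(node):
    """Yield parent-child label pairs as simple depth-1 AST patterns."""
    label, *children = node
    for child in children:
        yield (label, child[0])          # pattern: parent label -> child label
        yield from subpatterns(child)    # recurse into the subtree

ast = ("clause",
       ("head", ("functor", ("conc",)),
                ("args", ("nil",), ("var",), ("var",))))

patterns = set(subpatterns(ast))
# Each extracted pattern becomes a binary attribute: present or absent
# in a given submitted program. Most such patterns are meaningless for
# explanation, which is why expert filtering is needed.
```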
Domingos (2007) identified several reasons why combining machine learning and expert knowledge often fails, and how it should be approached. One of the reasons is that the results of machine learning are rarely optimal on the first attempt. Iterative improvement, where experts and computer improve the model in turns, is needed. Furthermore, some knowledge is hard to make explicit. It is known that humans are much better at explaining (or arguing) particular cases than explicating general knowledge.
Argument-based machine learning (ABML) (Možina, Žabkar, & Bratko, 2007) is an interactive method that helps a domain expert through a mechanism called the ABML knowledge refinement loop. In this loop, experts are prompted to share only the knowledge that the learning system cannot learn on its own – they are asked to provide arguments about selected misclassified examples. Since an argument relates to a single example and the link between its premises and conclusion is presumptive, experts find it easier to explain their knowledge in this way. In our case, instead of asking the expert for all relevant patterns, we ask him or her to provide reasons why a certain program was either correct or incorrect.
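The refinement loop described above can be sketched schematically; all function names here (`learn_rules`, `find_critical`, `ask_expert`) are hypothetical placeholders, not the actual ABCN2 API:

```python
# Hedged sketch of the ABML knowledge refinement loop. The helpers are
# stand-ins: learn_rules induces rules respecting the expert's arguments,
# find_critical picks a misclassified example, ask_expert records the
# expert's reasons for that single example.

def refinement_loop(examples, learn_rules, find_critical, ask_expert,
                    max_iterations=10):
    arguments = {}                        # example id -> expert's argument
    for _ in range(max_iterations):
        model = learn_rules(examples, arguments)
        critical = find_critical(model, examples)
        if critical is None:
            break                         # no critical examples remain
        # The expert explains only this one case, e.g. "incorrect because
        # the recursive call never shortens the list".
        arguments[critical] = ask_expert(critical)
    return learn_rules(examples, arguments)
```

The key design point is that the expert is never asked for general knowledge up front, only for case-by-case arguments on examples the learner currently gets wrong.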
This paper presents the following contributions. We first define abstract-syntax-tree patterns and how they are extracted. Then we describe a rule learning algorithm, based on the argument-based rule learning algorithm ABCN2 (Možina et al., 2007), for learning rules that represent typical approaches and errors in student solutions to programming assignments. Afterwards we present an extended version of the ABML refinement loop. In the evaluation section, we show that the patterns obtained by applying our algorithm to 42 Prolog exercises from our CodeQ tutoring system lead to accurate machine-learned models.
Section snippets
Knowledge acquisition for tutoring systems
Domain knowledge for a tutoring system is most often represented with a rule-based model, which is easily understood and modified by a human. Both major ITS paradigms represent domain knowledge with rules: model-tracing tutors use production rules to model the problem-solving process (Anderson, Boyle, Corbett, Lewis, 1990, Koedinger, Anderson, 1997), while constraint-based tutors use rules to describe constraints that must hold for every correct solution (Mitrovic, 2012, Ohlsson, 1992).
Creating
Dataset
Our educational data consists of Prolog programs submitted by students using the online programming environment CodeQ during the Principles of Programming Languages course at the University of Ljubljana. Students start programming in an empty editor. When they think their solution is ready, they submit it for testing. If the program fails the test cases, they continue working on it until it passes. We selected 42 exercises with enough submitted programs for
Argument-based machine learning
An argument comprises a series of premises intended to give a reason for a conclusion. Humans mostly use arguments to justify or explain their beliefs, and sometimes to convince others. In artificial intelligence, argumentation is the branch that analyzes automatic reasoning from arguments – how arguments for and against a certain claim are produced and evaluated. Argument-based machine learning (ABML) is a combination of argumentation and machine learning.
ABML uses arguments to
Experiments and evaluation
In this section we report and discuss the results of learning on a selection of 42 exercises used in our Prolog course. We randomly divided each data set into learning (70%) and testing (30%) sets, ensuring that all programs submitted by the same student were in the same set. Due to this restriction, the percentage of learning examples was only approximately 70%. Note that all procedures related to the ABML refinement loop, such as using cross-validation for detecting critical examples, can use only
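A student-grouped split of this kind can be reproduced with a few lines of standard Python (the `"student"` field and data layout here are illustrative, not the paper's actual data format):

```python
import random

def split_by_student(programs, train_fraction=0.7, seed=0):
    """Split submissions so that all programs by one student land in the
    same set; the realized train fraction is therefore only approximate."""
    students = sorted({p["student"] for p in programs})
    rng = random.Random(seed)
    rng.shuffle(students)
    cut = round(len(students) * train_fraction)
    train_students = set(students[:cut])
    train = [p for p in programs if p["student"] in train_students]
    test = [p for p in programs if p["student"] not in train_students]
    return train, test

# Toy usage: eight submissions from four students.
submissions = [{"student": s, "code": "..."} for s in "aabbbcdd"]
train, test = split_by_student(submissions)
```

Splitting by student rather than by submission prevents leakage: a student's later attempts are often near-duplicates of earlier ones, so mixing them across sets would inflate test accuracy.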
Conclusions
We have described a process for learning rules that characterize typical approaches and errors in students’ programming solutions. We described an argument-based rule learning algorithm that was tailored for this task, and an extended version of the ABML loop for acquiring arguments from experts. The most important extensions are the algorithm for selecting critical examples and the algorithm for selecting counter examples.
We evaluated the approach on our educational data of Prolog programs.
Acknowledgment
This work was partly supported by the Slovenian Research Agency (ARRS).
References (51)
- et al. Skill acquisition and the list tutor. Cognitive Science (1989)
- et al. ABML knowledge refinement loop: A case study. Proceedings of the IEEE 20th international symposium (ISMIS) (2012)
- et al. Codewebs: Scalable homework search for massive open online programming courses. Proceedings of the 23rd international world wide web conference (WWW) (2014)
- et al. Interacting meaningfully with machine learning systems: Three experiments. International Journal of Human-Computer Studies (2009)
- The AI detectives. Science (2017)
- et al. Examining multiple potential models in end-user interactive concept learning. Proceedings of the ACM conference on human factors in computing systems (2010)
- et al. Cognitive modeling and intelligent tutoring. Artificial Intelligence (1990)
- The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher (1984)
- et al. Pattern-based classification: A unifying perspective. CoRR (2011)
- et al. Rule induction with CN2: Some recent improvements. Proceedings of the fifth European conference on machine learning (EWSL) (1991)
- Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research
- Orange: Data mining toolbox in Python. Journal of Machine Learning Research
- Toward knowledge-rich data mining. Journal of Data Mining and Knowledge Discovery
- Interactive machine learning. Proceedings of the 8th international conference on intelligent user interfaces
- Plan ahead: Pricing ITS learner models. Proceedings of the 19th behavior representation in modeling & simulation conference
- Predictive learning via rule ensembles. The Annals of Applied Statistics
- An adaptable programming tutor for Haskell giving automated feedback. International Journal of Artificial Intelligence in Education
- OverCode: Visualizing variation in student solutions to programming problems at scale. ACM Transactions on Computer-Human Interaction
- Survey on using constraints in data mining. Data Mining and Knowledge Discovery
- Elicitation of neurological knowledge with argument-based machine learning. Artificial Intelligence in Medicine
- Online Python Tutor: Embeddable web-based program visualization for CS education. Proceedings of the 44th SIGCSE technical symposium on computer science education
- J-LATTE: A constraint-based tutor for Java. Proceedings of the 17th international conference on computers in education
- A survey of frequent subgraph mining algorithms. The Knowledge Engineering Review
- Program representation for automatic hint generation for a data-driven novice programming tutor. Proceedings of the 11th international conference on intelligent tutoring systems (ITS)
- Empirical comparison of graph classification algorithms. Proceedings of the IEEE symposium on computational intelligence and data mining
Cited by (7)
- A Survey of Automated Programming Hint Generation: The HINTS Framework. ACM Computing Surveys (2022)
- Abnormal video homework automatic detection system. Journal of Ambient Intelligence and Humanized Computing (2021)
- Learning by arguing in argument-based machine learning framework. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019)