
Expert Systems with Applications

Volume 112, 1 December 2018, Pages 110-124

Identifying typical approaches and errors in Prolog programming with argument-based machine learning

https://doi.org/10.1016/j.eswa.2018.06.029

Highlights

  • Abstract-syntax-tree (AST) patterns as attributes for classifying Prolog programs.

  • Identification of AST patterns for detecting errors and programming approaches.

  • An argument-based algorithm for learning rules suitable for tutoring.

  • Evaluation of extracted patterns and rules on 42 Prolog exercises.

Abstract

Students learn programming much faster when they receive feedback. However, in programming courses with high student-teacher ratios, it is practically impossible to provide feedback on every homework submission. In this paper, we propose a data-driven tool for semi-automatic identification of typical approaches and errors in student solutions. Given a list of frequent errors, a teacher can prepare common feedback for all students that explains the difficult concepts. We present the problem as supervised rule learning, where each rule corresponds to a specific approach or error. We use correct and incorrect submitted programs as the learning examples, with patterns in their abstract syntax trees serving as attributes. As the space of all possible patterns is immense, we needed the help of experts to select the relevant ones. To elicit knowledge from the experts, we used the argument-based machine learning (ABML) method, in which an expert and ABML interactively exchange arguments until the model is good enough. We provide a step-by-step demonstration of the ABML process, present examples of ABML questions and the corresponding expert's answers, and interpret some of the induced rules. The evaluation on 42 Prolog exercises further shows the usefulness of the knowledge elicitation process, as the models constructed using ABML achieve significantly better accuracy than models learned from human-defined patterns or from automatically extracted patterns.

Introduction

Programming is nowadays a must-have skill for professionals in many disciplines, and is becoming a part of basic education in many countries. It is taught at many universities and in massive open online courses (MOOCs). Learning programming requires a great deal of practice. Bloom (1984) showed that students learn much faster when a tutor is available to help select exercises, explain errors, and suggest possible problem-solving steps. However, due to high student-teacher ratios at universities – and even more so in MOOCs – it is practically impossible for human tutors to evaluate each student's work individually.

One solution is to use an intelligent tutoring system, where students implement programs in a controlled environment and receive help whenever needed. Such systems can be, in some cases, as effective as a human tutor (VanLehn, 2011). The alternative is to let students use their preferred environment, solve the problem by themselves, submit solutions and then receive automatically generated feedback. Both approaches, however, rely on a difficult knowledge acquisition task: we need to encode the teacher’s knowledge in an explicit form.

To address this problem, we propose a machine learning algorithm for learning rules that distinguish between correct and incorrect programs. These rules describe the different approaches that students used to solve exercises. A rule describing incorrect programs contains explicit reasons why those programs are incorrect. Conversely, a rule describing correct programs specifies the components necessary for a program to be correct. The semantics of these rules is analogous to that of constraint-based tutors (Mitrovic, 2012, Ohlsson, 1992), where constraints specify the necessary properties of correct submissions.
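To make the semantics of such rules concrete, the following minimal sketch (in Python, with hypothetical pattern names rather than the paper's actual implementation) represents a rule as a conjunction of conditions over AST patterns, each requiring a pattern to be present or absent:

    # A sketch of the rule representation described above: each rule is a
    # conjunction of AST-pattern conditions predicting the class of a program.
    from dataclasses import dataclass

    @dataclass
    class Condition:
        pattern: str       # identifier of an AST pattern
        present: bool      # must the pattern occur in the program?

    @dataclass
    class Rule:
        conditions: list   # conjunction of Condition objects
        label: str         # "correct" or "incorrect"

        def covers(self, program_patterns: set) -> bool:
            """True if the program satisfies every condition of the rule."""
            return all((c.pattern in program_patterns) == c.present
                       for c in self.conditions)

    # Illustrative rule: a program that unifies the head arguments directly
    # but never makes a recursive call is classified as incorrect.
    buggy = Rule([Condition("head_args_unified", True),
                  Condition("recursive_call", False)], "incorrect")
    print(buggy.covers({"head_args_unified"}))   # True -> flagged as incorrect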

A common problem when using machine learning to build expert systems is that the resulting model is too complex and does not mimic the expert’s cognitive processes. It is quite possible to have a system that has high classification accuracy but is poor at teaching or even explaining its reasoning because a lot of knowledge remains implicit (see Langley & Simon, 1995 or Voosen, 2017).

We encountered a similar problem. To learn from student-submitted programs, which are labeled as “correct” or “incorrect” according to a set of test cases, we decided to use patterns from the programs’ abstract syntax trees (ASTs) as attributes for machine learning. However, the space of all possible attributes contains many meaningless AST patterns, which cannot be used for explanation. We therefore needed a programming teacher to specify which AST patterns are useful for distinguishing between correct and incorrect programs. However, the programming teacher was unable to fully express this knowledge in advance.
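To illustrate the idea of AST patterns as attributes, here is a toy sketch; the nested-tuple tree encoding and the path-shaped patterns are simplifying assumptions for illustration, not the paper's exact pattern definition:

    # A toy sketch: a Prolog clause is encoded as a nested tuple, and each
    # root-to-node label path serves as one candidate boolean attribute.

    def paths(tree, prefix=()):
        """Enumerate root-to-node label paths in a nested-tuple AST."""
        label, children = tree[0], tree[1:]
        path = prefix + (label,)
        yield path
        for child in children:
            if isinstance(child, tuple):
                yield from paths(child, path)
            else:                        # leaf: an atom, number or variable
                yield path + (child,)

    # Simplified AST of the Prolog fact:  sum([], 0).
    clause = ("clause", ("head", ("functor", "sum"),
                         ("args", ("list", "[]"), ("num", "0"))))
    attributes = set(paths(clause))
    print(("clause", "head", "functor", "sum") in attributes)   # True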

Domingos (2007) identified several reasons why combining machine learning and expert knowledge often fails, and how it should be approached. One of the reasons is that the results of machine learning are rarely optimal on the first attempt. Iterative improvement, in which the experts and the computer take turns refining the model, is needed. Furthermore, some knowledge is hard to make explicit: humans are much better at explaining (or arguing about) particular cases than at explicating general knowledge.

Argument-based machine learning (ABML) (Možina, Žabkar, & Bratko, 2007) is an interactive method that engages a domain expert through a mechanism called the ABML knowledge refinement loop. In the loop, the experts are prompted to share only the knowledge that the learning system cannot learn on its own – they are asked to provide arguments about selected misclassified examples. Since an argument relates to a single example and the link between its premises and conclusion is presumptive, experts find it easier to explain their knowledge in this way. In our case, instead of asking the expert for all relevant patterns, we ask him or her to provide reasons why a certain program is correct or incorrect.
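Schematically, the refinement loop alternates between learning and expert feedback until the model is judged good enough. The sketch below (Python, with the learner, critical-example selection, and expert interaction passed in as stubs) mirrors only the structure of the loop, not the actual ABCN2 machinery:

    # A schematic sketch of the ABML knowledge refinement loop.

    def abml_loop(examples, learn, find_critical, ask_expert, good_enough):
        """Refine a model with expert arguments about critical examples.

        learn(examples, arguments) -> model
        find_critical(model, examples) -> a misclassified example or None
        ask_expert(example) -> argument (reasons for the example's class)
        """
        arguments = []
        model = learn(examples, arguments)
        while not good_enough(model):
            critical = find_critical(model, examples)
            if critical is None:
                break                               # nothing left to explain
            arguments.append(ask_expert(critical))  # expert argues one case
            model = learn(examples, arguments)      # relearn with the argument
        return model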

This paper presents the following contributions. We first define abstract-syntax-tree patterns and explain how they are extracted. Then we describe a rule learning algorithm, based on the argument-based rule learning algorithm ABCN2 (Možina et al., 2007), for learning rules that represent typical approaches and errors in student solutions to programming assignments. Afterwards, we present an extended version of the ABML refinement loop. In the evaluation section, we show that the patterns resulting from applying our algorithm to 42 Prolog exercises from our CodeQ tutoring system lead to accurate machine-learned models.

Section snippets

Knowledge acquisition for tutoring systems

Domain knowledge for a tutoring system is most often represented with a rule-based model, which is easily understood and modified by a human. Both major ITS paradigms represent domain knowledge with rules: model-tracing tutors use production rules to model the problem-solving process (Anderson, Boyle, Corbett, Lewis, 1990, Koedinger, Anderson, 1997), while constraint-based tutors use rules to describe constraints that must hold for every correct solution (Mitrovic, 2012, Ohlsson, 1992).
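In the constraint-based tradition, a constraint is commonly formulated as a pair of a relevance condition and a satisfaction condition: a solution violates the constraint when the constraint is relevant to it but not satisfied. A minimal sketch of this check (the feature names are illustrative, not taken from any cited tutor):

    # Constraint = (relevance condition, satisfaction condition); a solution
    # is here just a dictionary of boolean program features.

    def violated(constraint, solution):
        relevant, satisfied = constraint
        return relevant(solution) and not satisfied(solution)

    # "If a predicate calls itself, it must also have a base case."
    recursion_needs_base_case = (
        lambda s: s.get("recursive_call", False),   # relevance condition
        lambda s: s.get("base_case", False),        # satisfaction condition
    )
    print(violated(recursion_needs_base_case, {"recursive_call": True}))  # True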


Dataset

Our educational data consists of Prolog programs submitted by students through the online programming environment CodeQ during the Principles of Programming Languages course at the University of Ljubljana. Students begin with an empty editor and start programming. When they think their solution is ready, they submit it for testing. If their program fails the test cases, they continue working on it until they get it right. We selected 42 exercises with enough submitted programs for

Argument-based machine learning

An argument consists of a series of premises intended to give a reason for a conclusion. Humans mostly use arguments to justify or explain their beliefs, and sometimes to convince others. In artificial intelligence, argumentation is the branch that studies automatic reasoning with arguments – how arguments for and against a certain claim are produced and evaluated. Argument-based machine learning (ABML) is a combination of argumentation and machine learning.

ABML uses arguments to

Experiments and evaluation

In this section we report and discuss the results of learning on a selection of 42 exercises used in our Prolog course. We randomly divided each data set into a learning (70%) and a testing (30%) set, keeping all programs submitted by the same student in the same set. Due to this restriction, the percentage of learning examples was only approximately 70%. Note that all procedures related to the ABML refinement loop, such as using cross-validation for detecting critical examples, can use only
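Such a grouped split, with all of a student's programs kept together, can be sketched with scikit-learn's GroupShuffleSplit (an illustration, not the authors' evaluation code):

    # Grouped 70/30 split: submissions by one student never straddle the sets,
    # so the realized training fraction is only approximately 70%.
    from sklearn.model_selection import GroupShuffleSplit

    programs = ["p1", "p2", "p3", "p4", "p5", "p6"]        # submissions
    labels   = [1, 0, 1, 1, 0, 1]                          # correct/incorrect
    students = ["ann", "ann", "bob", "cat", "cat", "dan"]  # grouping key

    gss = GroupShuffleSplit(n_splits=1, train_size=0.7, random_state=0)
    train_idx, test_idx = next(gss.split(programs, labels, groups=students))
    print(sorted({students[i] for i in train_idx}),
          sorted({students[i] for i in test_idx}))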

Conclusions

We have described a process for learning rules that characterize typical approaches and errors in students’ programming solutions. We described an argument-based rule learning algorithm that was tailored for this task, and an extended version of the ABML loop for acquiring arguments from experts. The most important extensions are the algorithm for selecting critical examples and the algorithm for selecting counter examples.

We evaluated the approach on our educational data of Prolog programs.

Acknowledgment

This work was partly supported by the Slovenian Research Agency (ARRS).

References (51)

  • J. Demšar

    Statistical comparisons of classifiers over multiple data sets

    Journal of Machine Learning Research

    (2006)
  • J. Demšar et al.

    Orange: Data mining toolbox in Python

    Journal of Machine Learning Research

    (2013)
  • P. Domingos

    Toward knowledge-rich data mining

    Data Mining and Knowledge Discovery

    (2007)
  • J.A. Fails et al.

    Interactive machine learning

    Proceedings of the 8th international conference on intelligent user interfaces

    (2003)
  • J.T. Folsom-Kovarik et al.

    Plan ahead: Pricing ITS learner models

    Proceedings of the 19th behavior representation in modeling & simulation conference

    (2010)
  • J.H. Friedman et al.

    Predictive learning via rule ensembles

    The Annals of Applied Statistics

    (2008)
  • A. Gerdes et al.

    An adaptable programming tutor for Haskell giving automated feedback

    International Journal of Artificial Intelligence in Education

    (2017)
  • E.L. Glassman et al.

    OverCode: Visualizing variation in student solutions to programming problems at scale

    ACM Transactions on Computer-Human Interaction

    (2015)
  • V. Grossi et al.

    Survey on using constraints in data mining

    Data Mining and Knowledge Discovery

    (2017)
  • V. Groznik et al.

    Elicitation of neurological knowledge with argument-based machine learning

    Artificial Intelligence in Medicine

    (2013)
  • P.J. Guo

    Online Python Tutor: Embeddable web-based program visualization for CS education

    Proceedings of the 44th SIGCSE technical symposium on computer science education

    (2013)
  • J. Holland et al.

    J-LATTE: A constraint-based tutor for Java

    Proceedings of the 17th international conference on computers in education

    (2009)
  • C. Jiang et al.

    A survey of frequent subgraph mining algorithms

    The Knowledge Engineering Review

    (2013)
  • W. Jin et al.

    Program representation for automatic hint generation for a data-driven novice programming tutor

    Proceedings of the 11th international conference on intelligent tutoring systems (ITS)

    (2012)
  • N.S. Ketkar et al.

    Empirical comparison of graph classification algorithms

    Proceedings of the IEEE symposium on computational intelligence and data mining

    (2009)