Synonyms

Example-based programming; Inductive program synthesis; Inductive synthesis; Programming by examples; Program synthesis from examples

Definition

Inductive programming is the inference of an algorithm or program featuring recursive calls or repetition control structures, starting from information that is known to be incomplete, called the evidence, such as positive and negative input-output examples or clausal constraints. The inferred program must be correct with respect to the provided evidence in a generalizing sense: it should be neither equivalent to the evidence nor inconsistent with it. Inductive programming is guided explicitly or implicitly by a language bias and a search bias. The inference may draw on background knowledge or query an oracle. In addition to induction, abduction may be used. The restriction to algorithms and programs featuring recursive calls or repetition control structures distinguishes inductive programming from concept learning or classification.

We here restrict ourselves to the inference of declarative programs, whether functional or logic, and dispense with repetition control structures in the inferred program in favor of recursive calls.

Motivation and Background

Inductive program synthesis is a branch of the field of program synthesis, which addresses a cognitive question as old as computers, namely, the understanding of the human act of computer programming, to the point where a computer can be made to help in this task (and ultimately to enhance itself). See Flener (2002) and Gulwani et al. (2014) for surveys; the other main branches of program synthesis are based on deductive inference, namely, constructive program synthesis and transformational program synthesis. In such deductive program synthesis, the provided information, called the specification, is assumed to be complete (in contrast to inductive program synthesis where the provided information is known to be incomplete), and the presence of repetitive or recursive control structures in the synthesized program is not imposed.

Research on the inductive synthesis of recursive functional programs started in the early 1970s and was brought onto firm theoretical foundations with the seminal thesys system of Summers (1977) and the work of Biermann (1978), where all the evidence is handled non-incrementally. Essentially, the idea is first to infer computation traces from input-output examples (instances) and then to use a trace-based programming method to fold these traces into a recursive program. The main results until the mid-1980s were surveyed in Smith (1984). Due to limited progress with respect to the range of programs that could be synthesized, research activities decreased significantly in the following decades. However, a new approach that formalizes functional program synthesis in the term rewriting framework and that allows the synthesis of a broader class of programs than the classical approaches is pursued in Kitzelmann and Schmid (2006).

The advent of logic programming brought a new élan but also a new direction in the early 1980s, especially due to the mis system of Shapiro (1983), eventually spawning the new field of inductive logic programming (ILP). Most of this ILP work addresses a wider class of problems, as the focus is not only on recursive logic programs: more adequate designations are inductive theory revision and declarative program debugging, as an additional input is a possibly empty initial theory or program that is incrementally revised or debugged according to each newly presented piece of evidence, possibly in the presence of background knowledge or an oracle. The main results on the inductive synthesis of recursive logic programs were surveyed in Flener and Yılmaz (1999).

Structure of Learning System

The core of an inductive programming system is a mechanism for constructing a recursive generalization of a set of input/output examples (instances). Although we use the vocabulary of logic programming, this mechanism also covers the synthesis of functional programs.

The input, often a set of input/output examples, is called the evidence. Further evidence may be queried from an oracle. Additional information, in the form of predicate symbols that can be used during the synthesis, can be provided as background knowledge. Since the hypothesis space – the set of legal recursive programs – is infinite, a language bias is introduced. One particularly useful and common approach in inductive programming is to provide a statement bias by means of a program schema.

The evidential synthesis of a recursive program starts from the provided evidence for some predicate symbol and works essentially as follows. A program schema is chosen to provide a template for the program structure, where all yet undefined predicate symbols must be instantiated during the synthesis. Predefined predicate symbols of the background knowledge are then chosen for some of these undefined predicate symbols in the template. If it is deemed that the remaining undefined predicate symbols cannot all be instantiated via purely structural generalization by non-recursive definitions, then the method is recursively called to infer recursive definitions for some of them (this is called predicate invention and amounts to shifting the vocabulary bias); otherwise the synthesis ends successfully right away. This generic method can backtrack to any choice point for synthesizing alternative programs.
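The control flow of this generic method can be sketched in Haskell as follows (a toy illustration of ours, not the code of any actual system): every choice point becomes a nondeterministic choice in the list monad, so backtracking over schemas and instantiations is obtained for free; the types Clause, Program, and Schema and the placeholders instantiate and correctWrt are purely hypothetical.

-- Toy sketch of the schema-guided synthesis loop described above; all names
-- and types are placeholders, not the representation of any actual system.
type Clause  = String
type Program = [Clause]
data Schema  = DivideAndConquer | Accumulate deriving Show

synthesize :: [Clause] -> Program -> [Program]
synthesize evidence background = do
  schema    <- [DivideAndConquer, Accumulate]  -- choice point: program schema
  candidate <- instantiate schema background   -- choice point: open predicates
  if correctWrt evidence candidate
    then return candidate                      -- correct w.r.t. the evidence
    else []                                    -- otherwise backtrack
  where
    -- Placeholders: a real synthesizer would close the open predicates of the
    -- chosen template here, recursively inventing predicates where needed,
    -- and would check candidate programs against the evidence.
    instantiate s bg = [("% instance of the " ++ show s ++ " schema") : bg]
    correctWrt _ _   = True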

In the rest of this section, we discuss this basic terminology of inductive programming more precisely. In the next section, instantiations of this generic method by some well-known methods are presented.

The Evidence and the Oracle

The evidence is often limited to ground positive examples of the predicate symbols that are to be defined. Ground negative examples are convenient to prevent overgeneralization, but should be used constructively and not just to reject candidate programs. A useful generalization of ground examples is evidence in the form of a set of (non-recursive) clauses, as variables and additional predicate symbols can then be used.

Example 1

The delOdds(L, R) relation, which holds if and only if R is the integer list L without its odd elements, can be incompletely described by the following clausal evidence:

$$\displaystyle{ \begin{array}{rcl} \mathit{delOdds}([\ ],[\ ])& \leftarrow &\mathit{true} \\ \mathit{delOdds}([X],[\ ])& \leftarrow &\mathit{odd}(X) \\ \mathit{delOdds}([X],[X])& \leftarrow &\neg \mathit{odd}(X) \\ \mathit{delOdds}([X,Y ],[Y ])& \leftarrow &\mathit{odd}(X),\ \neg \mathit{odd}(Y ) \\ \mathit{delOdds}([X,Y ],[X,Y ])& \leftarrow &\neg \mathit{odd}(X),\ \neg \mathit{odd}(Y ) \\ \mathit{false}& \leftarrow &\mathit{delOdds}([X],[X]),\ \mathit{odd}(X)\end{array} }$$
(1)

The first clause is a ground positive example. The second and third clauses generalize the infinity of ground positive examples for handling singleton lists, such as delOdds([5], [ ]) and delOdds([22], [22]), while the fourth and fifth clauses summarize the infinity of ground positive examples for handling two-element lists whose second element is even: these clauses make explicit the underlying filtering relation (odd), which is intrinsic to the problem at hand but cannot be conveyed via ground examples and would otherwise have to be guessed. The sixth clause summarizes an infinity of ground negative examples for handling singleton lists, namely those where the only element of the list is odd and yet not filtered out.

In some methods, especially for the induction of functional programs, the first n positive input-output examples with respect to the underlying data type are presented (e.g., for linear lists, what to do with the empty list, with a one-element list, up to a list with three elements); because of this ordering of examples, no explicit presentation of negative examples is then necessary.

Inductive program synthesis should be monotonic in the evidence (more evidence should never yield a less complete program, and less evidence should not yield a more complete program) and should not be sensitive to the order of presentation of the evidence.

Program Schemas

Informally, a program schema contains a template program and a set of axioms. The template abstracts a class of actual programs, called instances, in the sense that it represents their dataflow and control flow by means of placeholders, but does not make explicit all their actual computations nor all their actual data structures. The axioms restrict the possible instances of the placeholders and define their interrelationships. Note that a schema is problem independent. Let us here take a first-order logic approach and consider templates as open logic programs (i.e. programs where some placeholder predicate symbols are left undefined or open; a program with no open predicate symbols is said to be closed) and axioms as first-order specifications of these open predicate symbols.

Example 2

Most methods of inductive synthesis are biased by program schemas whose templates have clauses of the forms in the following generic template:

$$\displaystyle{ \begin{array}{rcl} r(X,Y,Z)& \leftarrow &c(X,Y,Z), \\ & &p(X,Y,Z) \\ r(X,Y,Z)& \leftarrow &d(X,H,X_{1},\ldots,X_{t},Z), \\ & &r(X_{1},Y _{1},Z),\ \ldots,\ r(X_{t},Y _{t},Z), \\ & &q(H,Y _{1},\ldots,Y _{t},Z,Y )\end{array} }$$
(2)

where c, d, p, q are open predicate symbols, X is a nonempty sequence of terms, and Y, Z are possibly empty sequences of terms. The intended semantics of this generic template can be informally described as follows. For an arbitrary relation r over parameters X, Y, Z, an instance of this generic template is to determine the values of result parameter Y corresponding to a given value of induction parameter X, considering the value of auxiliary parameter Z. Two cases arise: either the c test succeeds and X has a value for which Y can be computed directly through p, or X has a value for which Y cannot be computed so directly and the divide-and-conquer principle is applied:

  1. divide X through d into a term H and t terms X1, …, Xt of the same type as X but smaller than X according to some well-founded relation;

  2. conquer through t recursive calls to r to determine the values of Y1, …, Yt corresponding to X1, …, Xt, respectively, considering the value of Z;

  3. combine through q the terms H, Y1, …, Yt, Z to build Y.

Enforcing this intended semantics must be done manually, as a template by itself has no semantics, in the sense that any program is an instance of it (it suffices to define c by a program that always succeeds and p by the given program). One way to do this is to attach to a template some axioms (see Smith (1985) for the divide-and-conquer axioms), namely, the set of specifications of its open predicate symbols: these specifications refer to each other, including the one of r, and are generic (because even the specification of r is unknown), but can be manually abduced once and for all according to the informal semantics of the schema.
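To make this intended semantics concrete, the generic template (2) can be transcribed, for the special case t = 1 and an empty Z, as a Haskell higher-order function whose arguments play the roles of the open symbols c, p, d, and q; the function dcTemplate and the instance listLength below are our own illustrations, not part of any particular method.

-- Template (2) with t = 1 and no auxiliary parameter Z: fixing c, p, d, q
-- yields a concrete recursive program r.
dcTemplate :: (a -> Bool)     -- c : can the result be computed directly?
           -> (a -> b)        -- p : direct, non-recursive computation
           -> (a -> (h, a))   -- d : divide X into a term H and a smaller X1
           -> (h -> b -> b)   -- q : combine H and Y1 into the result Y
           -> a -> b
dcTemplate c p d q = r
  where
    r x
      | c x       = p x             -- base case
      | otherwise = q h (r x1)      -- conquer (one recursive call) and combine
      where
        (h, x1) = d x               -- divide

-- One instance: the length of a list, with c = null, p = const 0,
-- d = head-tail decomposition, and q counting the head.
listLength :: [a] -> Int
listLength = dcTemplate null (const 0) (\(x : xs) -> (x, xs)) (\_ n -> n + 1)

The program synthesized for delOdds in Example 4 below is, in effect, another instance of the same template, with q testing whether the head is odd.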

Predicate Invention

Another important language bias is the available vocabulary, which is here the set of predicate symbols mentioned in the evidence set or actually defined in the background knowledge (and possibly mentioned by the oracle). If an inductive synthesis fails, other than backtracking to a different program schema (i.e., shifting the statement bias), one can try to shift the vocabulary bias by inventing new predicate symbols and inducing programs for them in the extended vocabulary; this is also known as performing constructive induction. Only the invention of recursively defined predicate symbols is necessary, as a non-recursive definition of a predicate symbol can be eliminated by substitution (under resolution) for its calls in the induced program (even though that might make the program longer).

In general, it is undecidable whether predicate invention is necessary to induce a finite program in the vocabulary of its evidence and background knowledge (as a consequence of Rice’s theorem, 1953), but introducing new predicate symbols always allows the induction of a finite program (as a consequence of a result by Kleene), as shown in Stahl (1995). The necessity of shifting the vocabulary bias is only decidable for some restricted languages (but the bias shift attempt might then be unsuccessful), so in practice one often has to resort to heuristics. Note that an inductive synthesizer of recursive algorithms may be recursive itself: it may recursively invoke itself for a necessary new predicate symbol.

Besides this decision problem, the difficulties of predicate invention are as follows. First, adequate formal parameters for a new predicate symbol have to be identified among all the variables in the clause using it. This can be done instantaneously by using precomputations done manually once and for all at the template level. Second, evidence for a new predicate symbol has to be abduced from the current program using the evidence for the old predicate symbol. This usually requires an oracle for the old predicate symbol, whose program is still unfinished at that moment and cannot be used. Third, the abduced evidence may be less plentiful than the evidence for the old predicate symbol (note that if the new predicate symbol occurs in a recursive clause, then no new evidence might be abduced from the old evidence that is covered by the base clauses) and can be quite sparse, so that the new synthesis is more difficult. This sparseness problem can be illustrated by an example.

Example 3

Given the positive ground examples factorial(0, 1), factorial(1, 1), factorial(2, 2), factorial(3, 6), and factorial(4, 24) and given the still open program:

$$\displaystyle{\begin{array}{rcl} \mathit{factorial}(N,F)& \leftarrow &N = 0,\ F = 1 \\ \mathit{factorial}(N,F)& \leftarrow &\mathit{add}(M,1,N), \\ & &\mathit{factorial}(M,G), \\ & &\mathit{product}(N,G,F) \end{array} }$$

where add is known but product was just invented (and named so only for the reader's convenience), the abduceable examples are product(1, 1, 1), product(2, 1, 2), product(3, 2, 6), and product(4, 6, 24), which is hardly enough for inducing a recursive program for product; note that there is one fewer example than for factorial. Indeed, examples such as product(3, 6, 18), product(2, 6, 12), product(1, 6, 6), etc. are missing, which puts the given examples more than one resolution step apart, if not on different resolution paths. This is aggravated by the absence of an oracle for the invented predicate symbol, which is not necessarily intrinsic to the task at hand (although product actually is intrinsic to factorial).
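For illustration (this functional rendering is ours and is not produced by any of the cited systems), the program ultimately aimed at pairs the recursive definition of factorial with a recursive definition of the invented symbol, here written product' only to avoid confusion with a built-in function:

-- Target of the synthesis: factorial delegates the multiplication to the
-- invented product', which must itself be defined recursively, here by
-- repeated addition since only add is available as background knowledge.
factorial :: Integer -> Integer
factorial 0 = 1
factorial n = product' n (factorial (n - 1))

-- Invented auxiliary: product' n g computes n * g.
product' :: Integer -> Integer -> Integer
product' 0 _ = 0
product' n g = g + product' (n - 1) g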

Background Knowledge

In an inductive programming context, background knowledge is particularly important, as the inference of recursive programs is more difficult than the inference of classifiers. For the efficiency of synthesis, it is crucial that this collection of definitions of the predefined predicate symbols be annotated with information about the types of their arguments and about whether some well-founded relation is being enforced between some of their arguments, so that semantically suitable instances for the open predicate symbols of any chosen program schema can be readily spotted. (This requires in turn that the types of the arguments of the predicate symbols in the provided evidence are declared as well.) The background knowledge should be problem independent, and an inductive programming method should be able to perform knowledge mobilization, namely organizing it dynamically according to relevance to the current task.
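One possible, purely hypothetical representation of such annotations (the names and structure below are ours, not those of any implemented system) is a table of entries recording the argument types of each predefined predicate symbol and the argument pairs related by a well-founded relation:

-- Hypothetical annotated background knowledge: each entry declares the
-- argument types of a predefined predicate symbol and records which argument
-- pairs are related by a well-founded (strictly decreasing) relation.
data Type = IntT | BoolT | ListT Type deriving (Show, Eq)

data BKEntry = BKEntry
  { bkName       :: String
  , bkArgTypes   :: [Type]
  , bkDecreasing :: [(Int, Int)]  -- (i, j): argument j is smaller than argument i
  } deriving Show

backgroundKnowledge :: [BKEntry]
backgroundKnowledge =
  [ BKEntry "odd"  [IntT]                    []
  , BKEntry "add"  [IntT, IntT, IntT]        []
  , BKEntry "tail" [ListT IntT, ListT IntT]  [(1, 2)]  -- the tail is smaller
  ]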

In data-driven, analytical approaches, background knowledge is used in combination with explanation-based learning (EBL) methods, such as abduction (see Example 4) or systematic rewriting of input/output examples into computational traces (see Example 5).

Background knowledge can also be given in the form of constraints or an explicit inductive bias, as in meta-interpretive learning (Muggleton and Lin 2013) or in the use of higher-order patterns (Katayama 2006).
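For instance, a higher-order pattern such as Haskell's foldr captures an entire family of recursive programs at once, so that search reduces to instantiating its arguments; as a small illustration of ours (not the output of any particular system), the delOdds relation of Example 1 is one such instance:

-- delOdds as an instance of the higher-order pattern foldr: the search space
-- is reduced to choosing the base case and the combining function.
delOdds :: [Int] -> [Int]
delOdds = foldr (\x r -> if odd x then r else x : r) []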

Programs and Data

Example 4

The dialogs (Dialogue-based Inductive-Abductive LOGic program Synthesizer) method (Flener 1997) is interactive. The main design objective was to relieve the specifier of any extra burden by having the method ask for exactly and only the information it needs, with default answers provided wherever possible. As a result, no evidence needs to be prepared in advance, as the method invents its own candidate evidence and queries the oracle about it, with an opportunity to declare (at the oracle/specifier's risk) that enough information has been provided. All answers by the oracle are stored as judgments, to prevent asking the same query twice. This is suitable for all levels of expertise of human users, as the queries are formulated in the specifier's initially unknown conceptual language, in such a way that the specifier must know the answers if she really feels the need for the wanted program. The method is schema-biased, and the current implementation has two schemas. The template of the divide-and-conquer schema has the generality of the generic template (2). The template of the accumulate schema extends this by requiring an accumulator in the sequence Z of auxiliary parameters. The evidence language (observation language) is (non-recursive) logic programs with negation. Type declarations are provided as a language bias. The program language (hypothesis language) is recursive logic programs with negation, with possibly multiple base cases and recursive cases.

For instance, starting from the empty program for the relation delOdds in Example 1, the algorithm design choices of using the divide-and-conquer schema, with R as result parameter and L as induction parameter, decomposed (with t = 1) by d into its head and tail, lead to the following intermediate open program:

$$\displaystyle{ \begin{array}{llll} \mathit{delOdds}(L,R)&& \leftarrow &c(L,R),\ p(L,R) \\ \mathit{delOdds}(L,R)&& \leftarrow &d(L,H,L_{1}), \\ && &\mathit{delOdds}(L_{1},R_{1}), \\ && &q(H,R_{1},R) \\ c({\_},{\_}) && \leftarrow &\mathit{true} \\ d(L,H,T) && \leftarrow &L = [H\vert T]\end{array} }$$
(3)

The first five evidential clauses for delOdds in (1) are then acquired, implicitly and interactively, from the oracle/specifier through the following question-and-answer dialogue, which leads the specifier to reveal the intrinsic predicate symbol odd:

dialogs:  When does delOdds([ ], R) hold?
Specifier:  If R = [ ].
dialogs:  When does delOdds([X], R) hold?
Specifier:  If (odd(X) and R = [ ]) or (¬ odd(X) and R = [X]).
dialogs:  When does delOdds([X, Y], R) hold, assuming odd(Y)?
Specifier:  If (odd(X) and R = [ ]) or (¬ odd(X) and R = [X]).
dialogs:  When does delOdds([X, Y], R) hold, assuming ¬ odd(Y)?
Specifier:  If (odd(X) and R = [Y]) or (¬ odd(X) and R = [X, Y]).

Next, abduction infers the following evidence set for the still open predicate symbols p and q:

$$\displaystyle{\begin{array}{rcl} p([\ ],[\ ])& \leftarrow &\mathit{true} \\ p([X],[\ ])& \leftarrow &\mathit{odd}(X) \\ p([X],[X])& \leftarrow &\neg \mathit{odd}(X) \\ p([X,Y ],[Y ])& \leftarrow &\mathit{odd}(X),\ \neg \mathit{odd}(Y ) \\ p([X,Y ],[X,Y ])& \leftarrow &\neg \mathit{odd}(X),\ \neg \mathit{odd}(Y ) \\ q(X,[\ ],[\ ])& \leftarrow &\mathit{odd}(X) \\ q(X,[\ ],[X])& \leftarrow &\neg \mathit{odd}(X) \\ q(X,[Y ],[Y ])& \leftarrow &\mathit{odd}(X) \\ q(X,[Y ],[X,Y ])& \leftarrow &\neg \mathit{odd}(X)\end{array} }$$

From this, induction infers the following closed programs for p and q:

$$\displaystyle{ \begin{array}{llll} p([\ ],[\ ]) && \leftarrow &\quad \mathit{true} \\ q(H,L,[H\vert L])&& \leftarrow &\quad \neg \mathit{odd}(H) \\ q(H,L,L) && \leftarrow &\quad \mathit{odd}(H)\end{array} }$$
(4)

The final closed program is the union of the programs (3) and (4), as no predicate invention is deemed necessary. Sample syntheses with predicate invention are presented in Flener (1997) and Flener and Yılmaz (1999).
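For readers more familiar with functional notation, a direct transcription of this final program (our rendering, given only to make the result concrete) is:

-- Union of programs (3) and (4) in functional form: the first equation comes
-- from c and p, the remaining ones from d (head-tail decomposition) and q.
delOdds :: [Int] -> [Int]
delOdds []      = []                   -- p : the empty list maps to itself
delOdds (h : t)                        -- d : divide into head H and tail L1
  | odd h       = delOdds t            -- q : an odd head is dropped
  | otherwise   = h : delOdds t        -- q : an even head is kept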

Example 5

The thesys method (Summers 1977) was one of the first methods for the inductive synthesis of functional (Lisp) programs. Although it has a rather restricted scope, it can be seen as the methodological foundation of many later methods for inducing functional programs. The noninteractive method is schema-biased, and the implementation has two schemas. Upon adaptation to functional programming, the template of the linear recursion schema is the instance of the generic template (2) obtained by having X as a sequence of exactly one induction parameter and Z as the empty sequence of auxiliary parameters, and by dividing X into t = 1 smaller value X1, so that there is only t = 1 recursive call. The template of the accumulate schema extends this by having Z as a sequence of exactly one auxiliary parameter, playing the role of an accumulator. The evidence language (observation language) is sets of ground positive examples. The program language (hypothesis language) is recursive functional programs, with possibly multiple base cases, but only one recursive case. The only primitive functions are nil, cons, head, tail, and empty, because the implementation is limited to the list data type, inductively defined by list ::= nil | cons(x, list), under the axioms empty(nil) = true, head(cons(x, y)) = x, and tail(cons(x, y)) = y. There is no function invention.
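For illustration, this vocabulary can be transcribed directly into Haskell (the transcription is ours; primed names merely avoid clashes with the standard library), and the stated axioms then hold by construction:

-- The thesys list data type and its primitive functions.
data List a = Nil | Cons a (List a) deriving Show

empty' :: List a -> Bool
empty' Nil = True
empty' _   = False

head' :: List a -> a
head' (Cons x _) = x

tail' :: List a -> List a
tail' (Cons _ y) = y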

For instance, from the following examples of a list unpacking function:

$$\displaystyle{\begin{array}{lcl} \mathit{unpack}(\mathit{nil}) & =&\mathit{nil} \\ \mathit{unpack}((A)) & =&((A)) \\ \mathit{unpack}((A\ B)) & =&((A)\ (B)) \\ \mathit{unpack}((A\ B\ C))& =&((A)\ (B)\ (C))\end{array} }$$

the abduced traces are:

$$\displaystyle{\begin{array}{lcl} \mathit{empty}(X) & \rightarrow &\mathit{nil} \\ \mathit{empty}(\mathit{tail}(X)) & \rightarrow &\mathit{cons}(X,\mathit{nil}) \\ \mathit{empty}(\mathit{tail}(\mathit{tail}(X))) & \rightarrow &\mathit{cons}(\mathit{cons}(\mathit{head}(X),\mathit{nil}),\mathit{cons}(\mathit{tail}(X),\mathit{nil})) \\ \mathit{empty}(\mathit{tail}(\mathit{tail}(\mathit{tail}(X))))& \rightarrow &\mathit{cons}(\mathit{cons}(\mathit{head}(X),\mathit{nil}),\mathit{cons}(\mathit{cons}(\mathit{head}(\mathit{tail}(X)),\mathit{nil}), \\ & &\mathit{cons}(\mathit{tail}(\mathit{tail}(X)),\mathit{nil})))\end{array} }$$

and the induced program is:

$$\displaystyle{\begin{array}{lclcl} \mathit{unpack}(X)& =&\mathit{empty}(X) & \rightarrow &\mathit{nil}, \\ & &\mathit{empty}(\mathit{tail}(X))& \rightarrow &\mathit{cons}(X,\mathit{nil}), \\ & &\mathit{true} & \rightarrow &\mathit{cons}(\mathit{cons}(\mathit{head}(X),\mathit{nil}),\mathit{unpack}(\mathit{tail}(X)))\end{array}}$$
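Transcribed to Haskell's built-in lists (with nil, cons, head, tail, and empty rendered as [], (:), head, tail, and null), the induced program reads as follows; the transcription is ours and is given only to show that the induced definition behaves as expected:

-- The induced program on built-in lists; for example, unpack "ABC"
-- evaluates to ["A", "B", "C"].
unpack :: [a] -> [[a]]
unpack x
  | null x        = []
  | null (tail x) = [x]
  | otherwise     = [head x] : unpack (tail x)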

A modern extension of thesys is the igor method (Kitzelmann and Schmid 2006). The underlying program template describes the set of all functional programs with the following restrictions: built-in functions can only be first-order, and no nested or mutual recursion is allowed. igor adopts the two-step approach of thesys. Synthesis is still restricted to structural problems, where only the structure of the arguments matters, but not their contents, such as in list reversing. Nevertheless, the scope of synthesizable programs is considerably larger. For instance, tree-recursive functions and functions with hidden parameters can be induced. Most notably, programs consisting of a calling function and an arbitrary set of further recursive functions can be induced. The first step of synthesis (trace construction) is therefore expanded so that traces can contain nestings of conditions. The second step is expanded so that the synthesis of a function can rely on the invention and synthesis of other functions (i.e., igor uses a technique of function invention analogous to the concept of predicate invention introduced above). An extension, igor2, relies on constructor term rewriting techniques. The two synthesis steps are merged into one and make use of background knowledge. Therefore, the synthesis of programs for semantic problems, such as list sorting, becomes feasible.
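As an illustration of such a semantic problem, consider list sorting: given background knowledge about inserting an element into an already sorted list, the kind of recursive program aimed at is insertion sort; the following rendering is our example only and is not claimed to be the literal output of igor2:

-- Background function assumed to be available: insert an element into a
-- sorted list, keeping it sorted.
insert :: Int -> [Int] -> [Int]
insert x []       = [x]
insert x (y : ys)
  | x <= y        = x : y : ys
  | otherwise     = y : insert x ys

-- The calling function recurses on the list structure and delegates the
-- semantic work to the background function insert.
isort :: [Int] -> [Int]
isort []       = []
isort (x : xs) = insert x (isort xs)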

Applications

In the framework of software engineering, inductive programming is defined as the inference of information that is pertinent to the construction of a generalized computational system for which the provided evidence is a representative sample (Flener and Partridge 2001). In other words, inductive programming does not have to be a panacea for software development in the large, inferring a complete software system, in order to be useful: it suffices to induce, for instance, a self-contained system module while programming in the small, problem features and decision logic for specification acquisition and enhancement, or support for debugging and testing. Inductive programming is then not always limited to programs with repetitive or recursive control structures. There are opportunities for synergy with manual programming and deductive program synthesis, as there are sometimes system modules that no one knows how to specify in a complete way, or that are harder to specify or program in a complete way, and yet for which incomplete information such as input-output examples is readily available. More examples and pointers to the literature are given in Flener (2002, Section 5) and Flener and Partridge (2001).

In the context of end-user programming, inductive programming methods can be used to enable nonexpert users to take advantage of the more sophisticated functionalities offered by their software. This kind of application is in the focus of programming by demonstration (PBD).

Finally, it is worth having an evidential synthesizer of recursive algorithms invoked by a more general-purpose machine learning method when necessary predicate invention is detected or conjectured, as such general methods require a lot of evidence to infer reliably a recursively defined hypothesis.

Future Directions

Inductive programming is still mainly a topic of basic research, exploring how the intellectual ability of humans to infer generalized recursive procedures from incomplete evidence can be captured in the form of synthesis methods. Already a variety of promising methods are available. A necessary step should be to compare and analyze the current methods. A first extensive comparison of different ILP methods for inductive programming was presented some years ago (Flener and Yılmaz 1999). An up-to-date analysis should take into account not only ILP methods but also methods for the synthesis of functional programs, using classical (Kitzelmann and Schmid 2006) as well as evolutionary (Olsson 1995) methods. The methods should be compared with respect to the required quantity of evidence, the kind and amount of background knowledge, the scope of programs that can be synthesized, and the efficiency of synthesis. Such an empirical comparison should result in the definition of characteristics that describe concisely the scope, usefulness, and efficiency of the existing methods in different problem domains. A first step toward such a systematic comparison was presented in Hofmann et al. (2009).

Since only a few inductive programming methods can deal with semantic problems, it would be useful to investigate how inductive programming methods can be combined with other machine learning methods, such as kernel-based classification.

Finally, the existing methods should be adapted to a broad variety of application areas in the context of programming assistance, as well as in other domains where recursive data structures or recursive procedures are relevant.

Cross-References