1 Introduction

Examples can help people understand abstractions [1, 6, 22, 23] such as models. One of the great features of the Alloy Analyzer is that it can mechanically generate examples of the user’s model (formula). These examples are inside-the-box, meaning that they are consistent with the model. If the user deems the generated example desirable, then it affirms that the model is a true expression of the user’s intent. If the user deems the generated example undesirable, then it is a concrete representation of an underconstraint problem in the model: the model needs to be tightened to exclude the undesirable example. The Alloy Analyzer generates examples arbitrarily, without specifically targeting either desirable or undesirable examples.

If the model has a partial overconstraint bug, then Alloy’s example generation facility is of limited assistance. A partial overconstraint bug means that the model unintentionally excludes some examples that should be included. Total overconstraint means that there are no examples that are consistent with the model. Alloy’s unsatisfiable core feature highlights a subset of the model that causes the total overconstraint. Partial overconstraint bugs are tricky to detect [14], and currently have no explicit tool support in Alloy.

A facility for generating near-miss examples (i.e., outside-the-box examples) might help the user diagnose partial overconstraint bugs. What the user might like to see is an example that is formally excluded by the model but which she actually intends the model to include (i.e., is desirable). Cognitive psychologists have found that near-miss examples revealing contrast are effective for human learning [6].

A simple, if inconvenient, technique for generating outside-the-box examples is to manually negate the model and use Alloy’s existing example generation facility. But the chances of this technique generating desirable examples are slim, since there are typically so many more examples outside the box than inside the box. The chances of a near-miss example being desirable are higher, because a near-miss example is similar to examples that are desirable.

We have developed a technique and prototype tool, named Bordeaux, for doing relative minimization of examples. Given mutually inconsistent constraints, A and C, it will search for examples a and c, respectively, that are at a minimum distance to each other (measured by the number of tuples added or removed).

To find a near-miss example for A, simply set C to be the negation of A and commence the relative minimization procedure. We say that a is a near-hit example and that c is a near-miss example (both of A with respect to C). The space between the near-hit and the near-miss is the border: there are, by definition, no examples of either A or C on the border. Examples consistent with either A or C must be within A or C and hence are not on the border. Therefore, the distance of an example to the border cannot be assessed directly: only the distance between the near-hit and the near-miss examples can be measured.

To further guide the search towards desirable near-miss examples, the Bordeaux tool has an affordance for the user to specify which relations are permitted to differ. Bordeaux uses Alloy\(^*\) [9], which is an extension of Alloy Analyzer, to solve formulas with higher-order quantifiers.

The experiments in Sect. 5 compare Bordeaux with Aluminum [13] and the Alloy Analyzer version 4.2. Bordeaux does a better job of producing pairs of near-hit and near-miss examples that are close to each other, with some computational cost. In some cases the absolute minimization technique of Aluminum produces results similar to the relative minimization technique of Bordeaux, but in other cases the results differ significantly.

Based on observations of the experiments, we design and implement two optimizations for Bordeaux in Sect. 6: scope tightening and parallelization. The key observation is that, in practice, the near-hit and near-miss are usually very close to each other. The optimizations reduce the computational cost of Bordeaux by over an order of magnitude.

In the next section, we review related work and discuss how Bordeaux differs from similar tools. Section 3 sketches an illustrative example. In Sect. 4, we define the concepts and formulas for finding near-hit and near-miss examples, and discuss some special cases of these formulas that might be interesting for users. Section 5 presents the experimental evaluation of Bordeaux and its comparison with state-of-the-art Alloy analysis tools. Two approaches to optimizing the prototype are described in Sect. 6. Section 7 concludes.

2 Related Work

Both Nelson et al. [13] and Cunha et al. [4] have proposed techniques to guide Alloy Analyzer’s example generation facility towards more interesting inside examples. The stock Alloy Analyzer generates arbitrary inside examples, which might or might not be interesting, and might or might not help the user discover underconstraint bugs.

The Nelson et al. [13] extension of Alloy, called Aluminum, generates minimal inside examples. We say that this approach produces absolute minimum examples, because it finds the smallest examples that satisfy a given model. By contrast, our technique looks for relatively minimum pairs of examples: one inside (near-hit) and the other outside (near-miss) that are at a minimum distance from each other; they might not be absolutely minimal from Aluminum’s perspective. Aluminum also has a facility for growing the minimal example, called scenario exploration.

Cunha et al. [4] used PMax-SAT [3] to enhance Kodkod [21] to find examples that are close to a target example. They discussed applications in data structure repair and model transformation. Perhaps this technique could be modified to replace our usage of Alloy\(^*\) in Bordeaux.

When the model is completely overconstrained (i.e., inconsistent), then no inside examples are possible. Shlyakhter et al. [19] enhanced Alloy to highlight the unsatisfiable core of such models; Torlak et al. [20] further enhanced this functionality. This tells the user a subset of the model (i.e., formula) that needs to be changed, but does not give an example (because none are possible inside).

Browsing desirable outside examples might help the user understand what is wrong with the model [1]. In an empirical user study, Zayan et al. [23] evaluated the effects of using inside and outside examples in model comprehension and domain knowledge transfer. The study demonstrated evidence of the usefulness of outside examples in understanding the models, but did not state any preferences for particular examples. Browsing (desirable) outside examples might also help the user understand partially overconstrained models (in which some, but not all, desirable instances are possible).

Batot [2] designed a tool for automating MDE tasks. The tool generates examples from partial or complete metamodels to be evaluated or corrected by an expert. The minimality and coverage of examples are two major criteria for generating useful examples. Mottu et al. [12] proposed a mutation analysis technique to improve model transformation testing. Their technique mutates the model w.r.t. four abstract model transformation operators and generates mutants for evaluating test-suites. Macedo and Cunha [7] proposed a tool for analyzing bidirectional model transformations based on least changes using Alloy. The tool tries different numbers of changes to find the least number. Selecting proper scopes for Alloy Analyzer is a major obstacle to scaling the tool.

In his seminal work, Winston [22] introduced the use of near-miss examples in learning classification procedures as well as in explaining failures in learning unusual cases. Gick and Paterson [6] studied the value of near-miss examples for human learning. They found that contrasting near-miss examples were the most effective examples for learning. Popelínsky [15] used near-miss examples for synthesizing normal logic programs from a small set of examples. Seater [18] employed the concepts of near-miss and near-hit examples to explain the role, i.e., restricting or relaxing, of a constraint in a given model. Modeling By Example [8] is an unimplemented technique used to synthesize an Alloy model using near-miss and near-hit examples. The technique synthesizes an initial model from a set of examples; it learns the boundaries by generating near-miss and near-hit examples to be reviewed by the user. The near-hit and near-miss examples are from a slightly modified model.

ParAlloy [16] and Ranger [17] realize parallel analysis of models written in Alloy. Both tools partition a given Alloy model and make multiple calls to the underlying SAT-solver. The idea of parallelization in Bordeaux relies on selecting proper scopes as opposed to partitioning the model.

3 Illustrative Example

Consider a model that describes an undergraduate degree in computer engineering, as in Fig. 1. In this illustrative model, a student must take two courses to graduate, and she must have taken all necessary prerequisites for each course.

When asked to generate an inside example consistent with the model, Alloy Analyzer produces an example similar to Fig. 2a. Everything looks OK: this example corresponds with the user’s intentions. But this model harbours a partial overconstraint bug: there are examples that the user intends, but which are not consistent with the model.

Fig. 1. Model of requirements for undergraduate Computer Engineering degree

Bordeaux generates two near-miss examples (Figs. 2b and c). These are outside examples at a minimum distance from the example in Fig. 2a, adding one tuple to relations courses and reqs, respectively. The first near-miss example reveals the partial overconstraint: a student is prevented from graduating if they take an extra course. The user rectifies this by changing the equality predicate (\(=\)) on Line 6 of Fig. 1 to a less-than-or-equal-to (\(\le \)). The second near-miss example is not interesting to the user because it just involves a perturbation of the prerequisites. Subsequent searches can be set to exclude the reqs relation.

Fig. 2. Examples revealing an overconstraint issue in the model of Fig. 1

Alloy can be used to generate an arbitrary outside example (e.g., Fig. 2d) if the user manually negates the model. This unfocused outside example is unlikely to be meaningful for the user, as it might be too divergent from her intention.

4 Proximate Pair-Finder Formula

We first define the basic concepts and formulas required to understand the formulas that Bordeaux synthesizes for producing near-hit and near-miss examples.

Definition 1

(Model). We define an Alloy model as triple \(\langle R, C, B \rangle \) comprising an ordered set of relations R, a set of constraints (formulas) on those relations C, and finite bounds B for those relations.

Definition 2

(Valuation). A valuation V of model M is a sequence of sets of tuples, where each entry in the sequence corresponds to a relation in M, and is within M’s bounds B. Let \(\mathcal {V}\) name the set of all possible valuations of M.

The size (\(\#\)) of a valuation is the number of tuples: \(\#V \triangleq \sum _{i=1}^{|R|} |V_i|\)

Definition 3

(Instance). An instance I of model M is a type-correct valuation of M, according to Alloy’s type system [5]. Briefly, every atom contained in the instance will be in exactly one unary relation, and the columns of each non-unary relation will be defined in terms of the unary relations.

Suppose that I and J are two instances of model M.

The difference of I and J (\(I-J\)) is a valuation of model M that, for each relation, contains the tuples from I that are not in J.

We say that J is a subset (\(\subset \)) of I if there is at least one relation for which J’s tuples are a strict subset of I’s tuples, and no relation for which J’s tuples are not included in I’s tuples; formally: \(J \subset I \triangleq \bigwedge _{i=1}^{|R|} (J_i \subseteq I_i) \ \wedge \ \exists i \mid J_i \subset I_i\)

The distance from I to J is \(\#(I-J)\).

Let \(\mathcal {I}\) name the set of all instances of M.

Definition 4

(Inside-the-box Example). Instance I is an inside example of model M if I satisfies M’s constraints C.

Definition 5

(Outside-the-box Example). Instance O is an outside example of model M if O does not satisfy M’s constraints C.

4.1 Proximate Pair-Finder Formula

The core of Bordeaux generates variants of the Proximate Pair-Finder Formula (PPFF), which it gives to Alloy\(^*\) to solve. The input to the PPFF generation is two mutually inconsistent sets of constraints, \(C_1\) and \(C_2\), over the same set of relations, R. A solution to the PPFF is a pair of examples, one of which (\(e_1\)) is inside \(C_1\), and the other of which (\(e_2\)) is inside \(C_2\). The key property of these two examples is that they are a minimum distance to each other. In the special case where \(C_2\) is the negation of \(C_1\), which the narrative of this paper focuses on, then \(e_1\) is a near-hit example of \(C_1\) and \(e_2\) is a near-miss example of \(C_1\).

The PPFF is expressed as a set-comprehension that returns a pair of examples \(e_1\) and \(e_2\). The PPFF contains two higher-order quantifiers: they are higher-order because they quantify over valuations (sets of sets). The formula effectively says that there is no other pair of examples that are closer to each other than are \(e_1\) and \(e_2\). Valuation v in the PPFF is the difference \(e_2 - e_1\). Valuation w in the PPFF is the difference \(e_2^\prime - e_1^\prime \). The relative minimization condition is that the size of w is not smaller than the size of v: \(\#v \le \#w\).

Fig. 3. Proximate Pair-Finder Formula (PPFF). The first line defines \(e_1\) and \(e_2\) as examples of \(C_1\) and \(C_2\), respectively. The second line defines v as the difference \(e_2 - e_1\). The third line introduces alternative examples \(e_1^\prime \) and \(e_2^\prime \), and their difference w. The fourth line says that w is not less than v: i.e., there is no pair of alternative examples that are closer to each other than are \(e_1\) and \(e_2\).
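Since the figure body is not reproduced here, the following is one plausible rendering of the four lines that the caption describes. It is a reconstruction from the caption and the surrounding definitions only; the exact placement of the quantifiers and any additional side conditions in the original figure are assumptions.

\[
\begin{aligned}
\{\, e_1, e_2 : \mathcal {I} \mid\ & C_1(e_1) \wedge C_2(e_2) \ \wedge \\
& \exists v : \mathcal {V} \mid v = e_2 - e_1 \ \wedge \\
& \forall e_1^\prime , e_2^\prime : \mathcal {I},\ w : \mathcal {V} \mid \big (C_1(e_1^\prime ) \wedge C_2(e_2^\prime ) \wedge w = e_2^\prime - e_1^\prime \big ) \implies \\
& \#v \le \#w \,\}
\end{aligned}
\]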

In the degenerate case where \(C_1\) and \(C_2\) are not mutually inconsistent, then the PPFF will always return \(e_1 = e_2\), because any arbitrary example is at distance zero to itself. The PPFF is not designed to be meaningful when the constraints are not mutually inconsistent.

The examples \(e_1\) and \(e_2\) are not necessarily absolutely minimal with respect to \(C_1\) and \(C_2\), respectively. These two examples are relatively minimal with respect to each other: that is, the distance between them is as small as possible.
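To make the relative minimization concrete, here is a brute-force sketch over a tiny universe. It is not how Bordeaux works (Bordeaux delegates the search to Alloy\(^*\)); it only enumerates all candidate valuations and keeps the closest pairs. The flat encoding of a valuation as a set of (relation, atom) pairs, the requirement that \(e_2\) extends \(e_1\) (read off the remark in Sect. 4.3 that \(e_2\) is bigger than \(e_1\)), and the toy "exactly two courses" constraint are all assumptions made for illustration.

```python
from itertools import combinations

def powerset(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def proximate_pairs(universe, c1, c2):
    """Return (d, pairs): the minimum of #(e2 - e1) over all e1 satisfying c1
    and e2 satisfying c2 with e1 contained in e2, plus every pair at that distance."""
    best, pairs = None, []
    candidates = list(powerset(universe))
    for e1 in candidates:
        if not c1(e1):
            continue
        for e2 in candidates:
            if not c2(e2) or not e1 <= e2:
                continue
            d = len(e2 - e1)
            if best is None or d < best:
                best, pairs = d, [(e1, e2)]
            elif d == best:
                pairs.append((e1, e2))
    return best, pairs

# Toy stand-in for the graduation model: a student graduates with exactly two courses.
universe = [("courses", "c1"), ("courses", "c2"), ("courses", "c3")]
model = lambda e: len(e) == 2           # C1: the (overconstrained) model
not_model = lambda e: not model(e)      # C2: its negation

min_distance, found = proximate_pairs(universe, model, not_model)
```

With these settings every near-hit is a two-course example, and its near-miss is the three-course example one tuple away, mirroring the extra-course near-miss discussed in Sect. 3.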

4.2 Encoding the PPFF for Alloy\(^*\)

Alloy\(^*\) supports higher-order quantifiers: i.e., quantifiers over relations, which is required to solve PPFF. The user’s model must be written in regular Alloy, with no higher-order quantifiers. Bordeaux transforms the user’s Alloy model into an Alloy\(^*\) model and adds a variant of the PPFF synthesized for the user’s desired search. Bordeaux then transforms the Alloy\(^*\) solution back into the terms of the user’s original model.

While the Alloy\(^*\) language is syntactically a superset of the regular Alloy language, so the user’s model is a legal Alloy\(^*\) model, simply taking the user’s model as-is will not work for the PPFF. This paper focuses on the special case where \(C_2\) is the negation of \(C_1\), and \(C_1\) is all of the constraints of the model. So the transformation to prepare for solving the PPFF must bundle up all of the constraints of the original model (fact blocks, multiplicity constraints, etc.) into a single predicate.

In actuality, the PPFF is generated using existential quantifiers rather than a set comprehension, and the skolemization gives the examples \(e_1\) and \(e_2\).

4.3 Special Cases of Potential User Interest

The user might be interested in some of the following special cases, which can all be easily accommodated by generating the PPFF with specific settings for \(C_1\) and \(C_2\) (some of these are not yet implemented in the current prototype [10]):

  1. Find a near-miss example and a near-hit example: Set \(C_2\) to be the negation of \(C_1\) (as discussed above).

  2. Find a near-miss example close to an inside example: Set \(C_1\) to be a predicate that defines the inside example, and set \(C_2\) to be the negation of the model’s constraints.

  3. Find a near-hit example close to an outside example: Set \(C_1\) to be a predicate that defines the outside example, and set \(C_2\) to be the model’s constraints.

  4. Restrict the difference between the examples to certain relations: The difference operation can easily be generated over a user-specified subset of the relations, rather than all of them.

  5. Smaller near-miss examples: In PPFF, \(e_2\) is bigger than \(e_1\). If \(C_2\) is the negation of the model’s constraints, this will result in a near-miss example that is larger than the near-hit. To get a smaller near-miss example, simply set \(C_1\) to be the negation of the model’s constraints, and \(C_2\) to be the model’s constraints.

  6. Find a near-miss example for an inconsistent model: If the original model is inconsistent, then it has no inside examples. A workaround for this situation is to set \(C_1\) to be an empty example (no tuples), and set \(C_2\) to be the negation of the model.
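The cases that differ only in the choice of \(C_1\) and \(C_2\) can be summarized as a small lookup, sketched below. Constraints are modelled as Python predicates over a set of tuples; the case names and the helpers `negate`, `exactly`, and `ppff_settings` are hypothetical and do not correspond to any Bordeaux API. Case 4 is omitted because it changes the difference operation rather than the constraints.

```python
def negate(pred):
    """Predicate that holds exactly when pred does not."""
    return lambda e: not pred(e)

def exactly(example):
    """Predicate that holds only for the given example."""
    target = frozenset(example)
    return lambda e: frozenset(e) == target

def ppff_settings(case, model, example=None):
    """Return the pair (C1, C2) for one of the special cases listed above."""
    if case == "near_hit_and_near_miss":        # case 1
        return model, negate(model)
    if case == "near_miss_of_inside_example":   # case 2
        return exactly(example), negate(model)
    if case == "near_hit_of_outside_example":   # case 3
        return exactly(example), model
    if case == "smaller_near_miss":             # case 5
        return negate(model), model
    if case == "inconsistent_model":            # case 6: C1 is the empty example
        return exactly(frozenset()), negate(model)
    raise ValueError(f"unknown case: {case}")

# Toy model (an assumption): a valuation is accepted when it has exactly two tuples.
toy_model = lambda e: len(e) == 2
c1, c2 = ppff_settings("near_miss_of_inside_example", toy_model, {"a", "b"})
```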

5 Experiments

To study the idea of browsing near-hit and near-miss examples, we have developed Bordeaux, a prototype that extends Alloy Analyzer. This section reports experiments that compare Bordeaux with similar tools. The results also suggest ways to improve the performance of Bordeaux in finding near-miss examples; the next section discusses our ideas for optimizing the prototype.

Given an example, Bordeaux can find a near-miss example. Users can browse more near-miss examples or ask for a near-hit example. To support this way of browsing, Bordeaux performs a relative minimization; namely, minimizing the distance between an inside example and an outside example. Although users cannot browse near-hit and near-miss examples with Alloy Analyzer, they can manually modify models to produce inside examples and outside examples. Using Aluminum, users can find minimal examples, and if they manually negate the model, they can browse minimal outside examples, too. Aluminum’s concept of a minimal example, which we call absolute minimal, is an example with the smallest number of tuples.

The experiment includes five models that are shown in Table 1. We have used an Intel i7-2600K CPU at 3.40 GHz with 16 GB memory. All experiments are done with MiniSat. In what follows, we explain the experiments and discuss their contribution to answer the following research questions:

  • RQ-1 What is the extra cost for the relative minimums?

  • RQ-2 How many near-miss examples can Bordeaux find in one minute?

  • RQ-3a How far are arbitrary outside examples from the near-miss?

  • RQ-3b How far are absolute minimum outside examples from the near-miss?

Table 1. Comparing Bordeaux (B) and Alloy Analyzer (A) to find outside examples

To study the extra cost for finding near-miss examples with Bordeaux, we used Alloy Analyzer to find arbitrary inside examples and outside examples and compared their costs to using Bordeaux to find near-hit/near-miss example pairs (Table 1). To find the outside examples, we manually negated the studied models, i.e., if C is a model’s constraint, then \(\lnot C\) gives the negation of the model. In these experiments, for Bordeaux, we set \(C_1\) to be equal to the arbitrary example returned by Alloy.

In Table 1, it can be seen that Bordeaux does not incur much additional cost for small models, but once the model gets larger the costs become significant (RQ-1). The small Binary Tree model is an exception where Bordeaux appears to run faster than the stock Alloy Analyzer. Occasional anomalies such as this are common with technology based on SAT solvers.

To answer RQ-2, we ran another experiment to count the number of distinct near-miss examples that Bordeaux generates in one minute. The results show how the prototype’s performance degrades for Alloy models with more relations or larger formulas. Given examples, Bordeaux produces 27, 4, 31, 15, and 9 distinct near-miss examples for the Singly-linked List, Doubly-linked List, Binary Tree, Graduation Plan, and File System models, respectively, in one minute. The performance degrades because Bordeaux reformulates and re-solves the model for each distinct inside example and outside example. Bordeaux returns more near-miss examples for the Singly-linked List and Binary Tree models because the given examples of both models are simpler than the others, so their near-miss examples have relatively fewer tuples. That is, smaller near-miss examples lead to smaller and simpler formulas for excluding redundant near-miss examples.

To answer RQ-3a and RQ-3b, we performed another experiment to demonstrate how the near-miss examples that Bordeaux systematically produces differ from the outside examples that other tools produce from manually modified models. To do so, using examples of various sizes from different models, we evaluated their distances to the outside examples that each instance-finder produces. We selected Alloy Analyzer and Aluminum for comparison with Bordeaux. Although Alloy Analyzer and Aluminum do not provide capabilities for browsing outside examples, we manually transformed the models and synthesized the required statements.

For comparing relative minimal, absolute minimal, and arbitrary outside examples, we have used the aforementioned tools to find outside examples given arbitrary, small, medium, and large size examples. In the case of arbitrary examples, each tool finds a pair of inside example and outside example without any extra constraints on the size of examples. With restricted-size examples, all the tools have to first generate the same size examples, then generate outside examples for them. Depending on the models, the size of the examples varies from two to five tuples in small size examples and nine to thirteen tuples for the large size examples. We have recorded the size of inside examples and outside examples that each tool produces, as well as the number of tuples that should be added or removed from an example to make an example identical with its paired outside example.

As Fig. 4 shows, Aluminum generates absolute minimal inside examples and outside examples when the example size is arbitrary. It also always produces minimal outside examples regardless of the size of the given examples. Alloy Analyzer generates arbitrary examples close to absolute minimal size, but the sizes of its outside examples do not follow any particular pattern. Although Bordeaux produces examples of arbitrary sizes, it produces outside examples with one more tuple in all the models.

Fig. 4. Comparing Bordeaux, Alloy Analyzer, and Aluminum with respect to the number of tuples that differ between an example and an outside-the-box example.

As depicted in Fig. 4, Bordeaux produces an outside example at a minimum distance from a given example. Answering RQ-3a, Alloy Analyzer behaves arbitrarily in producing outside examples close to the examples. The distances from examples to outside examples increase for larger examples. Answering RQ-3b, for arbitrary and small examples, Aluminum produces outside examples that are fairly close to the examples. Given medium and large examples, Aluminum finds outside examples at larger distances from the given examples. Although the distances between inside examples and outside examples generated by Aluminum do not fluctuate like the distances between inside examples and outside examples produced by Alloy Analyzer, they show relative minimum distances similar to those found by Bordeaux.

Moreover, finding an outside example by negating the model provides no direction for adding or removing tuples. Although we expected to see a near-miss example with extra tuples, as generated by Bordeaux, Aluminum produced an outside example with fewer tuples for the Singly-linked List model. Unlike Bordeaux, Alloy Analyzer and Aluminum do not directly produce outside examples of a model. Simulating a model’s negation does not necessarily cause Alloy Analyzer and Aluminum to produce outside examples at a minimum distance from given examples of the studied models.

6 Optimization

By reviewing the experiment results, we have observed a trend in distances between inside examples and outside examples returned by Bordeaux. In the studied models, with the addition of a single tuple, all inside examples and outside examples become identical. In other words, the examples are already near-hit examples, and they can be pushed to be outside examples with the minimum number of changes, i.e., a single tuple. This observation helps us select tighter scopes and parallelize searches for inside examples and outside examples. Without choosing tight scopes, the analysis becomes infeasible. Using parallelization, the time to find near-miss examples improves to 2.2 s on average from several minutes without parallelization.

6.1 Selecting Tighter Scopes

If most examples are near-hit examples, as the case studies show, Bordeaux can approximate the scope of each unary relation to be one more than the number of its tuples in the example when Alloy\(^*\) is used as the underlying solver. As depicted in Table 2, we have rerun our experimental models by selecting scopes of one (+1), two (+2), and three (+3) more than the number of tuples of the example for each unary relation in the models. Note that these scopes limit the number of tuples only for unary relations. Non-unary relations can still differ by any number of tuples between an inside example and an outside example.

Table 2. Showing how selecting different scopes affects the cost of analysis performed by Alloy\(^*\). The notations ‘+1’, ‘+2’, and ‘+3’ show the records when the scopes of all unary relations in the studied models are set to one, two, and three more tuples than the number of tuples in the same relations of examples. The columns with ‘+1’ in their headers contain the actual records. The other columns contain the increase ratios.

When the scopes of unary relations increase by one, Bordeaux can find a near-miss example for a studied model within 7.5 min on average. When the scopes increase by two, the time to find a near-miss example is inflated by a ratio of 8.43 on average. If the scopes increase by three, the time to find a near-miss example is fifteen times longer than with scopes increased by one. Moreover, except for one model, Bordeaux did not terminate within 90 min if the example size is large and the scopes increase by two. Such a lack of results within the time limit is more frequent once the scopes increase by three. Selecting the tightest scope increase can make the problem tractable for Bordeaux. If Bordeaux cannot find a near-miss example with the least scope increase, it can increase the scopes and search in a larger universe of discourse.
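The scope-tightening strategy, including the fallback to larger scopes, can be sketched as follows. This is a minimal illustration, not Bordeaux’s implementation: `solve` is a hypothetical callback standing in for a call to the underlying solver with the given scopes, and the signature names and the stub solver below are invented for the example.

```python
def tightened_scopes(example_unary_sizes, increase=1):
    """Scope of each unary relation: its tuple count in the example plus `increase`."""
    return {sig: count + increase for sig, count in example_unary_sizes.items()}

def search_with_escalation(example_unary_sizes, solve, max_increase=3):
    """Try the tightest scopes (+1) first and widen them only when nothing is found.

    `solve` is a hypothetical stand-in for the solver call: it takes a scope
    dictionary and returns a near-miss example, or None if there is none.
    """
    for increase in range(1, max_increase + 1):
        scopes = tightened_scopes(example_unary_sizes, increase)
        result = solve(scopes)
        if result is not None:
            return increase, result
    return None

# Illustrative data: the example has one Student atom and two Course atoms.
example_sizes = {"Student": 1, "Course": 2}

def stub_solve(scopes):
    """Stub solver that only succeeds once four Course atoms are allowed."""
    return "near-miss" if scopes["Course"] >= 4 else None
```

Starting from the smallest increase keeps the common case cheap, since the experiments above suggest that +1 usually suffices, while the loop still reaches near-miss examples that need a larger universe.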

6.2 Parallelization

Increasing the number of atoms exponentially increases the size of the SAT-formula, the translation time to generate it, and its solving time. In some cases, such as the Binary Tree model with a large size example, if the scope is not properly selected, Alloy\(^*\) cannot find a near-miss example within several hours. Another factor that influences the size of the SAT-formula is the number of integer atoms that Bordeaux incorporates into the formula to prevent the integer overflow that might occur in distance calculations.

Table 3. Parallelizing PPFF can improve the efficiency of Bordeaux. Columns show the ratio of metrics measured from solving without breaking PPFF, recorded in columns labeled by ‘+1’ in Table 2, to different approaches solving PPFF for each relation. The columns Min-R and Max-R show the improvement ratio for using parallelization. For Min-R, the first process finds a near-miss example, and for Max-R, all processes finish their searches. Column Seq-R shows the difference when all processes run sequentially.

Observing that most examples are near-hit examples and can become near-miss examples by adding or removing a single tuple, we make new formulas so that each one applies PPFF on individual relations. Solving each formula, Bordeaux may find a near-miss example for a given example regarding a particular relation of the model.

Finding near-miss examples for each relation has the benefit of avoiding additional integers in the universe of discourse. Depending on how many relations a model has, Bordeaux can solve one PPFF per relation, each of which leads to a relatively smaller universe of discourse. As Bordeaux can independently find near-miss examples for each relation, one approach is to parallelize the search so that each process searches for a near-miss example with respect to one relation.

The parallelization applies to all relations in the model. The parallelization has no particular restriction on the scopes of the model’s relations. However, selecting proper scopes increases the performance. If a process tries to find a minimum distance with respect to a unary relation, increasing the scope of the relation by one causes a found instance to be at a distance of one, provided adding tuples is requested. Since Alloy only allows restricting scope for unary relations, no increase is the tightest scope for a process that tries to find a minimum distance with respect to a non-unary relation; more than one tuple of a non-unary relation can also change.

In this approach, if a process finds a near-miss example first that is at a distance of one from the given example, then all other processes can stop their searches. Otherwise, all processes should continue. In the end, either (a) the last process returns a near-miss example with the distance of one from the given example, (b) all processes return nothing, (c) some processes return near-miss examples with a distance of two, or (d) some processes return near-miss examples with a distance of three or more.

If PPFF can find a near-miss example at distance one from the given example, then one of the processes must be able to find it too. The PPFF found the example because only one relation gains or loses a tuple, so running the formula over that relation alone gives the same result. If such a near-miss example exists, one process has to return it. In case (a), the last finished process returns such a near-miss example.

If no process returns a near-miss example, i.e., case (b), either such an instance does not exist at any distance from the given example or adding or removing more than one tuple from two or more relations turns the given example into an outside example. In the first situation, PPFF also returns no near-miss example; however, PPFF returns a near-miss example once tuples of more than a single relation need to be changed.

In case (c), some processes return near-miss examples with distance two from the given example. Then two is the true minimum distance. If there were a closer near-miss, it would be at distance one, and one of the other processes would have found it. Since that didn’t happen, two is the minimum distance.

If a process returns a near-miss example at a distance of three from a given example and all other processes return no shorter distances, the PPFF might find a near-miss example at a closer distance, which is exactly two. The distance might be two if simultaneously altering tuples of two relations makes a near-miss example; the individual processes cannot find such a near-miss example. If there were a near-miss example at distance one, a process would have returned it. In this case, distance three might be a local minimum. The same argument is valid for a distance of four or more in case (d).
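The per-relation decomposition and the case analysis above can be sketched as follows. This is an illustration under assumptions rather than Bordeaux’s implementation: `search_relation` is a hypothetical callback that runs the PPFF restricted to one relation and returns the minimum distance it finds (or `None`), threads stand in for the separate processes, and the early stop at distance one is omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def classify(distances):
    """Interpret per-relation minimum distances (None means nothing was found).

    Returns (distance, verdict) following the case analysis in the text.
    """
    found = [d for d in distances if d is not None]
    if not found:
        # Case (b): no near-miss exists, or several relations must change at once.
        return None, "inconclusive"
    best = min(found)
    if best <= 2:
        # Cases (a) and (c): a distance of one or two is the true minimum.
        return best, "exact"
    # Case (d): a two-relation change at distance two may have been missed.
    return best, "upper bound"

def parallel_search(relations, search_relation):
    """Run one single-relation search per relation concurrently and classify."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(relations))) as pool:
        futures = {pool.submit(search_relation, r): r for r in relations}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return classify(results.values())

# Stub standing in for the per-relation solver calls (illustrative values only).
stub_distances = {"courses": 1, "reqs": None}
outcome = parallel_search(list(stub_distances), stub_distances.get)
```

The verdict makes explicit when the decomposed search is as good as the full PPFF and when its answer should be treated only as an approximation of the minimum distance.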

As the case studies show, the distance between an example and its paired near-miss example is highly likely to be one; therefore, parallelizing Bordeaux would often give the near-miss examples. As discussed, the process could also provide a good approximation of the minimum distance. In the cases we studied, all near-miss examples are found at distance one from the given examples.

As Table 3 shows, parallelization improves the search for near-miss examples. In all studied models, regardless of the sizes of the given examples, parallelization decreases the size of the SAT-formula, the translation time to generate it, and the solving time. We have measured this improvement by recording the time and resources taken to find the first near-miss example, as well as the time and resources taken to finish all parallel processes. Compared to using non-broken PPFF with the least scope increase, the time to concurrently find the first near-miss example decreases by the ratio of 70.9 on average. Waiting for the termination of all processes’ results changes the ratio to 30.2.

If there are not enough resources available for parallelization, sequentially running the decomposed processes still has value. The studied models show that the sum of the process times is often less than the time taken by the undecomposed PPFF when the size of an example is large. Since most of the Alloy statements synthesized for each process are the same, the translation time might be saved by reusing some parts that have already been translated for the formula of another relation. Full assessment of this idea is left for future work.

7 Conclusion

Bordeaux is a tool for finding near-hit and near-miss example pairs that are close to each other. The near-hit example is inside-the-box: it is an example of the model the user wrote. The near-miss example is outside-the-box: it is almost consistent with the user’s model except for one or two crucial details. Others have found near-miss examples to be useful [1, 6, 22, 23]. In particular, Gick and Paterson [6] found that pairing a near-miss example with a similar near-hit example increased human comprehension of the model. We posit that such pairs might be particularly helpful for discovering and diagnosing partial over-constraints in the model. Tool support for this task is currently limited.

The Bordeaux prototype has been built to work with ordinary Alloy models. It works by transforming the user’s Alloy model and synthesizing a query with higher-order quantifiers that can be solved by Alloy\(^*\) [9]. Through experiments we have observed that near-hit and near-miss examples often differ in no more than one tuple. We have based two optimizations on this observation: scope tightening and parallelization. Together, they significantly reduce the cost of searching. The formalization of the idea, the PPFF (Fig. 3), is more general than the specific use-case that our narrative has centred on. The formalization works from a pair of inconsistent constraints. The use-case narrative in this paper has focused on the specific circumstance when one constraint is the negation of the other, and sometimes even more narrowly on when the first constraint is a specific example. We intend to make use of the more general facility in our forthcoming implementation of a pattern-based model debugger [11].