Introduction

Learning experiences are increasingly happening in open and unstructured contexts in which learners can choose their own learning goals, and dynamically reset them as they proceed. Learning objectives can be met by gleaning experiences from a variety of sources, often accessible through the web. But, even (or especially) in such open-ended learning contexts learners need guidance as to what learning goals to set and how to achieve them. This paper is about how to provide such guidance, and in particular how to provide guidance that is personalized to learners and appropriate to under-specified and dynamically changing learning contexts.

In more traditional learning contexts with well known topics, curricula, and defined learning objectives, intelligent tutoring systems have been built that can provide personalized guidance to learners through instructional (now usually called “pedagogical”) planning (Vassileva and Wasson 1996). Pedagogical planning systems take advantage of known relationships in the knowledge to be learned, prior knowledge of misconceptions and errors that learners typically make, already established methods for evaluating learners, and models of various learner characteristics to personalize the learning experience for each learner. For example, the Annie system (Thomas and Young 2011) is a discovery environment that uses structured input about the domain and the learning activity (including the initial and goal states of the world and a list of tasks with preconditions and expected postconditions) and supports learner discovery of the sequence of steps needed to achieve the goal. But for the open and unstructured learning contexts we are interested in, unfortunately none of these knowledge sources is typically available.

An alternative way of helping learners that is more suitable for such environments is through adaptive hypermedia (Brusilovsky et al. 1998). Typically, in an adaptive hypermedia system there is a model of each learner that is powered by domain models created by authoring teams (Brusilovsky 2012). The system consults these models for information that can be used to help the user navigate through “hyperspace”. Some adaptive hypermedia systems have been focussed on learning applications, such as learning Java programming (Hsiao et al. 2013). Systems can also serve as a tool for learners to organize and reflect on their own exploration such as in a system about math (Conati and Bunt 2004) or about colour concepts (Kashihara and Kawai 2010). Usually, content is well structured in the traditional sense, such as a course with specific desired learning outcomes.

Another alternative is through a recommender system for education (Manouselis et al. 2011). Recommender systems typically have a user model that tracks characteristics of the user (the learner in educational contexts) that are used to recommend items of relevance or interest to them. There are two main types of recommender system: content based and user based (Burke 2002). The former works by finding items with similar features to other items the “target” user has found interesting and recommending them to the user, but this method requires considerable metadata describing the content. The latter works by using collaborative filtering, a technique that finds users who have similar interests to the target user (as expressed in their user models) and recommends items that similar users have found interesting.

Typically, recommender systems recommend items one at a time, rather than longer learning paths, which are more appropriate for educational applications. Although there is some work with sequences of items, this research has not yet expanded much into educational applications. An approach called collaborative temporal order modeling in the recommender system literature (Karatzoglou 2011) makes use of user sequences of previous actions to recommend the next item (though not a sequence). In another approach called constructive recommendation, the system recommends structured groups of objects (Dragone 2017) that may take the form of a guided path. Tourism or travel route recommenders will recommend an ordered sequence of points to visit efficiently, without a lot of backtracking. This can be approached as a structured prediction problem (Chen et al. 2017) or by using recurrent neural networks (RNNs) (Baral et al. 2018). RNNs are used to incorporate the dimension of time into recommendations and have been implemented with individual user information to create personalized prediction of the next item in a sequence (Donkers et al. 2017).

Our goal is to find ways of recommending personalized sequences of items that will help a learner, and to be able to do so without having to create a lot of metadata about the learning material, without having to know the explicit goals that the learner has at any given point, and without having to create a structured learner model in advance and to constantly update and maintain it. In this paper we will describe a pedagogical planner that is able to continuously suggest a promising sequence of next steps to a learner as they navigate their way through an open-ended and unstructured learning environment, and is able to do this without pre-ordained knowledge about the information the learner is consuming and without pre-ordained knowledge about the learner, although having a learner model (or computing one from learner behaviour) can be useful to our approach.

The Ecological Approach and Pedagogical Planning

So far we’ve described open-ended and unstructured learning in an informal way. In this section we will be more precise about the kind of learning environment we are envisaging. In particular we are assuming learning environments that can be framed according to the ecological approach (EA) architecture (McCalla 2004). In the EA, a learning environment is assumed to be a collection of learning objects (LOs), where “learning object” is very broadly defined. A LO could be a block of text, an interactive simulation, a module of a tutoring system, a set of questions, a visualization of a complicated system, a formal test, in short any digital object from which somebody could learn. In particular, any web page could be a learning object. The EA architecture also assumes that there will be a learner model for each learner that contains characteristics of the learner, although this learner model does not necessarily have to be created and maintained externally. In fact, a learner model can be extracted and maintained over time from patterns in learner behaviour as the learner interacts with the learning objects. This is enabled in the EA since each learning object captures usage data as a learner interacts with it, both episodic data describing the actual interaction behaviour of the learner and an instantiation of the current version of the learner model. As learners interact with LOs, data accumulates with each LO about the learners who interacted with it and their behaviour while interacting with it. It isn’t our intention here to discuss in detail the many interesting aspects of the EA (for that see McCalla 2004), but note that an EA learning object repository is a rich and evolving source of data about learner behaviour that can be data mined for all sorts of interesting pedagogical patterns. For our purposes in this paper the goal is to find patterns that can inform a pedagogical planning algorithm.

To aid a learner who is driving their own learning in an EA environment, the planning problem becomes one of recommending a sequence of LOs that is personalized to the needs and capabilities of the learner. A planner would not only recommend a sequence of LOs, but also continuously monitor a learner’s interactions with each LO, looking for patterns in the interaction data attached to the LO that provide information about how well the learner is doing. Through such monitoring, at any time the planner could recommend a new sequence of LOs that might be more effective for the learner. With the advice of such a planner, the learner can be guided to more effective outcomes than a learner who haphazardly searches for learning objects with little or no guidance.

Planning in an EA environment has very different constraints than planning in a more structured environment. In an EA environment there is no externally created metadata about how the learning objects relate to one another nor how the learning objects relate to learners or learning outcomes. Instead, all relationships must be derived by the planner based on evidence extracted from the learner model data and behavioural data attached to learning objects. This means that an educational data mining approach based on finding patterns in learner data is the only way to inform a pedagogical planner in the EA. In fact for our planner we have used a collaborative filtering approach to find a neighbourhood of learners who in the past have had a similar learning experience to a target learner (as judged through analysis of interaction data), and then to create a plan for the target learner based on successful future learning experiences of learners in the neighbourhood (success also being judged through analysis of interaction data). We call this planner the CFLS planner: Collaborative Filtering based on Learning Sequences.

CFLS Planner

The CFLS planner first must build a neighbourhood of learners that are similar to the target learner. The CFLS planner looks at the b (standing for “backward”) most recent learning objects that the target learner has interacted with, and searches to find any other learners who also followed the same (or a similar) sequence of b LOs at some point in the past. Those learners with a comparable sequence at any point in their history are added to the neighbourhood. The neighbourhood could include learners who interacted with the materials long ago, since their usage data remains within the EA to benefit future learners. When their usage data was captured in the EA, a timestamp would have been recorded so that the order of the LOs visited can be re-created, even if months or years have passed since the interactions occurred. Neighbours could be assembled using an exact sequence match, or using a less restrictive set match, or a partial order in between. The best approach depends on the density of data available. We discuss our design decisions in “Experimenting with the CFLS Planner”.

Next, the CFLS planner checks each neighbour to see what happened to them after they interacted with the matching sequence of LOs. Along the neighbour’s subsequent path, the CFLS planner examines the usage data and makes a judgement about how “successful” the path was in terms of pedagogical outcomes (described more precisely near the end of “Experimenting with the CFLS Planner”). The CFLS planner then ranks the candidate paths and presents the most promising path to the target learner, producing a plan consisting of f (standing for “forward”) learning objects.

After the learner follows the sequence, the CFLS planner can recommend the next sequence ahead by repeating the above process. It discards the old neighbourhood and calculates a new one based on the learner’s now most recent sequence of b LOs. The members of a target learner’s neighbourhood will thus be changing all the time because the neighbourhood is reassembled as the target learner gathers new experiences.
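
To make this cycle concrete, the sketch below (our own illustration, not a released implementation) performs a single planning step. It assumes each learner’s history is stored as a list of (LO id, P[learned]) pairs in chronological order, and for simplicity it matches on an exact sequence; the matching criterion actually used in our experiment is discussed in “Experimenting with the CFLS Planner”.

```python
import random

def cfls_recommend(target_history, peer_histories, b, f):
    """One CFLS planning cycle (illustrative sketch).

    Histories are lists of (lo_id, p_learned) pairs, oldest first.  The
    target's last b LOs are matched against every window in every peer's
    history (here with an exact sequence match, for simplicity), and the
    forward path of f LOs with the highest average P[learned] among the
    matching peers is recommended.
    """
    recent = [lo for lo, _ in target_history[-b:]]
    best_score, best_path = -1.0, None
    for peer in peer_histories:
        for i in range(len(peer) - b - f + 1):
            window = [lo for lo, _ in peer[i:i + b]]
            if window == recent:                              # same b LOs, same order
                forward = peer[i + b:i + b + f]
                score = sum(p for _, p in forward) / f        # average P[learned] ahead
                if score > best_score:
                    best_score = score
                    best_path = [lo for lo, _ in forward]
    if best_path is None:                                     # cold start: copy a random peer's start
        best_path = [lo for lo, _ in random.choice(peer_histories)[:f]]
    return best_path
```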

The CFLS planner will perform more or less effectively depending on the amount of usage data available, i.e. the number of LOs, the number of learners, and the amount of interaction data that has been captured in the repository. If there isn’t much data available, there is a cold start problem. In this case the CFLS planner has to assemble a neighbourhood of learners with just a few LOs or even a single LO in common with the target learner. As a last resort, the CFLS planner can select a random learner.

We now need to show through an experiment that the CFLS planner works. We do this through simulation. We use simulation mainly because it allows us to generate lots of “learners” very quickly without the time and expense of using real human learners, and without any negative consequences to human learners. We can also fine tune the experiments by having complete control of the mix of learners and the various parameter tunings of the CFLS planner. Finally, simulation allows us to overcome the cold start problem by generating an initial knowledge base of synthetic data before running the actual experiment with learners who now have sufficient interactions for the CFLS planner to find meaningful neighbourhoods.

In the next section we describe the simulation model we have created to investigate the characteristics of the CFLS planner.

A Simulation Model to Test the CFLS Planner

Simulation is especially useful for designing technology where desirable behaviour isn’t explicitly built-in top down, but rather emerges from independent agents interacting with each other. Social scientists use simulation for exploring theories of social interaction by building artificial societies (Gilbert and Troitzsch 2005). Computer scientists use simulation to design for emergent behaviour in multi-agent systems (Keogh and Sonenberg 2014).

The same can be done for learners in educational environments. For example, the RLATES system uses a simulation for its training phase before being deployed with real learners (Iglesias et al. 2009). The authors trained this system on the interaction data from simulated learners using reinforcement learning (a Markov decision process), which learns to teach by trial and error and uses the simulation to try to reduce the number of teaching steps required. Simulation has also been used to create a synthetic dataset for a recommender system for an open corpus learning environment (Drachsler et al. 2008). In the AIED community, simulation has been suggested as a promising technique ever since VanLehn et al.’s (1994) seminal paper outlining three major ways it could be used: (i) for human teachers to practise teaching simulated agents, (ii) to test system designs through simulation, and (iii) for building simulated learning companions. Use (iii) has seen much work that has led to an entire AIED sub-field called pedagogical agents. But there hasn’t been all that much follow-up research into the use of simulation for the other two purposes. Recently, however, some research into the other uses of simulation has emerged, such as a high fidelity simulation of a cognitive tutor to be used in reciprocal learning (Matsuda et al. 2015) (use (i)), a medium fidelity simulation to explore aspects of a Ph.D. program (Lelei and McCalla 2018) (use (ii)), and a low fidelity simulation to explore simple planning and peer learning algorithms (Champaign 2012) (use (ii)). We compare our approach (also use (ii)) in more detail to Champaign’s work in the literature review section below, since our work also deploys low fidelity simulation to explore a planning algorithm.

For our approach, we simulate an environment based on the ecological approach architecture. Our simulation comprises simulated learners who interact with simulated learning objects. These interactions generate usage data that allow us to observe the effects of the CFLS planner in operation. Our model doesn’t have to be too sophisticated or realistic to shed light on the CFLS planner.

To model the learners, we simply capture that they differ from one another, calling this difference their aptitude. We don’t mean that some learners are more intelligent than others, just that some will be more likely to achieve high performance. We don’t need to model the reasons; whether they worked harder to become better prepared, whether they have some external advantage, or some other reason, it doesn’t matter to our model. It only matters that different learners bring different chances of success. Aptitude is represented with a number between 1 and 10, where a higher number means that in the evaluation function (below) the learner will “master” learning objects more easily. To keep this model simple, we say that an individual learner’s aptitude does not change as they learn, although, of course, their knowledge of the topic will.

Our model also contains learning objects. We capture that LOs are different from each other by creating an attribute called difficulty level, a number between 1 and 10 where a higher number indicates more difficult material. Again, in the evaluation function (below) learners will find more difficult LOs harder to master.

Additionally, we need to capture that LOs are often related to each other. In particular, a LO can be a prerequisite of another LO in that if a learner fails to grasp a prerequisite LO, they are more likely to struggle with the subsequent LO. In our simulation model we thus give each LO a list of references to prerequisite LOs. It should be emphasized that these prerequisite relationships are a property of the simulated world, but they are unavailable to the CFLS planner, which has been designed to work for situations where such explicit prerequisite relationships among LOs (or any other relationships among LOs) are unavailable. This is discussed in more detail in “Experimenting with the CFLS Planner”.
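
These simulated entities can be encoded very simply. The sketch below is one possible rendering (the field names are our own choices); note that the prerequisite list lives on the LO but is never exposed to the CFLS planner.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SimLearner:
    learner_id: int
    aptitude: int                 # 1..10, fixed for the whole simulation

@dataclass
class SimLearningObject:
    lo_id: int
    difficulty: int               # 1..10, higher means harder material
    prerequisites: List[int] = field(default_factory=list)   # LO ids; hidden from the CFLS planner
```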

Next, our simulation needs a representation of learning outcomes. To represent whether a learner has learned a LO, we have created a number called P[learned] with a value between 0 and 1. P[learned] can be thought of as the likelihood that the learner mastered the LO, or the system’s belief that the learner knows the LO. P[learned] is calculated using an evaluation function, a concept we introduced in earlier work (Erickson et al. 2013).

The evaluation function that we’ve defined for this simulation model is shown in (1). It has four terms, each denoted in small caps and normalized to a value between 0 and 1. Each term is weighted as to its overall importance in determining whether the learner has learned the LO. Note that P[learned] will always calculate to a number between 0 and 1. Whenever P[learned] ≥ 0.6, the learner is considered to have mastered the LO. We did not choose a higher threshold, such as 0.8, because it would have made it too difficult for low aptitude learners to progress.

Let us now look at how each term is computed.

$$ \begin{aligned} &P[\mathit{learned}] = (0.2)(\textsc{aptitude}) + (0.1)(1 - \textsc{difficulty-of-LO})\\ &\qquad + (0.5)(\textsc{hasPrerequisites}) + (0.2)(\textsc{seenBefore}) \end{aligned} $$
(1)

The first term takes into account the learner’s aptitude. The higher the aptitude, the more this term will contribute to the learner knowing the LO.

The second term takes into account the difficulty level of the LO. Equation (1) includes an inverse (a subtraction from 1) because the lower the difficulty level, the higher the chance the learner learned the LO.

The third term, hasPrerequisites, is computed as follows. If the LO has no prerequisites, then the number 1 is taken as the value for this term. If the LO has a single prerequisite, the number 1 is also taken if the learner has successfully mastered (P[learned] ≥ 0.6) the prerequisite; otherwise 0 is taken. If the LO has multiple prerequisites, the simulation checks the usage data to find out how many of the prerequisites the learner actually mastered, and takes the fraction of prerequisites mastered as the value for this term.

The last term is seenBefore, capturing that a learner is more likely to master the LO if they’ve seen it before. We defined seenBefore to return 0 if the learner has never seen the LO before, to return 0.1 if it has been seen once before, 0.2 for twice before, and so on up to 1.0 for ten times before. The value for this term is derived from the previous usage data (such data has been kept, as prescribed by the EA).

Of all four terms, the one with the heaviest weighting is hasPrerequisites, making up half of the overall weight. This is because in formal course settings, failure to master prerequisites is known to have a big impact on learning (hence all of the effort in curriculum design). So, the simulated learners are more likely to fail if the LO selection algorithm feeds them LOs that deviate from prerequisite order, but it is still possible for learners to master them. As mentioned before, prerequisites have no influence whatsoever on the CFLS planner, since the planning algorithm does not know anything about prerequisites. The effect of prerequisite relationships among the LOs is only discernible to the CFLS planner indirectly through their effect on how well the learners learn.

The next two terms, aptitude and seenBefore, have equal weight. The least weight is given to 1 - difficulty-of-LO, thus allowing for the possibility that even a LO with high difficulty can be mastered.
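
Putting the four weighted terms together, the evaluation function could be implemented roughly as follows, using the entity sketch from the previous section. The normalization of aptitude and difficulty by dividing by 10, and the layout of the learner’s usage history as a mapping from LO id to past P[learned] values, are our own choices for this sketch; the weights and the 0.6 mastery threshold are as given above.

```python
MASTERY = 0.6   # P[learned] at or above this counts as mastered

def p_learned(learner, lo, history):
    """Evaluation function (1) for one interaction of `learner` with `lo`.
    `history` maps lo_id -> list of this learner's past P[learned] values."""
    aptitude = learner.aptitude / 10.0                       # normalized aptitude
    ease = 1.0 - lo.difficulty / 10.0                        # 1 - difficulty-of-LO
    if lo.prerequisites:
        mastered = sum(1 for pre in lo.prerequisites
                       if any(v >= MASTERY for v in history.get(pre, [])))
        has_prereqs = mastered / len(lo.prerequisites)       # fraction of prerequisites mastered
    else:
        has_prereqs = 1.0                                    # no prerequisites at all
    seen_before = min(len(history.get(lo.lo_id, [])), 10) / 10.0
    return (0.2 * aptitude + 0.1 * ease
            + 0.5 * has_prereqs + 0.2 * seen_before)
```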

Each time a simulated learner interacts with a simulated LO, the P[learned] is computed, and a learner identifier, the P[learned] value, and a timestamp are “captured” and associated with the LO, as required in the EA. In our simulation this is the only usage data that is kept. But the total amount of data that informs the CFLS planner grows and grows as the simulation proceeds, with the evaluation function being calculated hundreds or even thousands of times.
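
In our simulation, the per-LO usage record can be as simple as the following (a sketch; the dictionary layout and the use of the simulation step as the timestamp are our own choices).

```python
def record_interaction(lo_usage, lo_id, learner_id, p_learned, timestamp):
    """Append the usage record kept with each LO, as the EA prescribes.
    `lo_usage` maps lo_id -> list of records; `timestamp` is the simulation step."""
    lo_usage.setdefault(lo_id, []).append(
        {"learner": learner_id, "p_learned": p_learned, "timestamp": timestamp})
```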

Our simulation model has “low fidelity”, but does capture important aspects of a learning environment. Learner aptitude allows us to represent that there are differences among learners. Each learning object has content, albeit consisting only of a difficulty level and its prerequisite relationships to other LOs. The outcome of learner interactions with learning objects can be determined, using the evaluation function. And, the capture of a learner’s interactions with a LO (required in the EA) is through associating the P[learned] value and timestamp of that learner with the LO. Of course, if the CFLS planner were deployed in the real world of human learners and actual learning objects, the learner models would be much more complex, the learning objects would have serious educational content, the interactions being captured would be derived from all sorts of learner behaviour (mouse clicks, scrolling actions, page dwell times, etc.), outcomes would have to be determined from inferences made about learners and their behaviour, and much more data would need to be captured with the learning objects. Nevertheless, we feel the simulation model does have enough fidelity to the real world to allow us to explore the characteristics and capabilities of the CFLS planner operating in an open ended EA environment. In the next section we discuss the actual simulation experiments we undertook to shed light on the CFLS planner.

Experimenting with the CFLS Planner

We have run a simulation experiment to explore whether the CFLS planner is effective. To this end, we created two other planners to serve as baselines against which to compare the CFLS planner.

The first of these is a simple prerequisite planner (SPP). The SPP serves up a plan to a simulated learner suggesting that they learn one of the next unmastered LOs in the prerequisite graph (similar to “traditional” instructional planning; Peachey and McCalla 1986). The SPP can still have considerable variation in the order that LOs are consumed from learner to learner even though the prerequisite graph itself remains the same (i.e. the same graph that informs the LO prerequisites described in the previous section). For instance, the child LOs of a prerequisite parent can be given in varying order. Or, a LO that is deep in the graph could appear much earlier for a learner with high aptitude who has mastered the prerequisites more quickly as compared to a low aptitude learner who wouldn’t see it until much later. The determination of mastery depends on a stochastic process, the evaluation function, which indirectly impacts the ordering of LOs in the SPP.
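
One possible sketch of such a prerequisite planner is shown below (illustrative only, not the exact implementation used in our experiment; choosing at random among the currently eligible LOs is one way to produce the learner-to-learner variation in ordering described above). It reuses the MASTERY threshold and history layout from the evaluation-function sketch.

```python
import random

def spp_next(history, los):
    """Simple prerequisite planner: serve one unmastered LO all of whose
    prerequisites have been mastered.  `history` maps lo_id -> list of the
    learner's past P[learned] values; `los` is the full repository."""
    mastered = {lo_id for lo_id, scores in history.items()
                if any(p >= MASTERY for p in scores)}
    eligible = [lo for lo in los
                if lo.lo_id not in mastered
                and all(pre in mastered for pre in lo.prerequisites)]
    # Fall back to any LO once everything eligible has been mastered (our assumption).
    return random.choice(eligible) if eligible else random.choice(los)
```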

The second baseline is a random planner (Random) that chooses randomly from LOs, regardless of prerequisite relationships or whether a LO has been mastered. We would like to compare how successful simulated learners are when using any of these three planners to guide them as they attempt to master the LOs in a repository.

We have two measures of success: coverage, i.e. how many LOs have been mastered, and expertise, i.e. how well a learner has learned the most advanced LOs. The coverage measure is computed in terms of the percentage of LOs mastered (P[learned] ≥ 0.6) as a proportion of all the LOs in the learning object repository. The expertise measure is computed as the average P[learned] on LOs that are leaf LOs in the prerequisite graph, the ones furthest into the graph. Leaf nodes can also be thought of as LOs representing something like capstone learning objectives or final exams, if we wish to view these LOs in terms of a standard course structure.
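
For illustration, the two measures can be computed from a learner’s accumulated usage data roughly as follows (a sketch; taking the best P[learned] a learner achieved on each leaf LO is one reasonable reading, and the history is again a list of (LO id, P[learned]) pairs).

```python
def coverage(history, all_lo_ids):
    """Percentage of the repository's LOs mastered at least once."""
    mastered = {lo for lo, p in history if p >= MASTERY}
    return 100.0 * len(mastered) / len(all_lo_ids)

def expertise(history, leaf_lo_ids):
    """Average P[learned] over the leaf LOs of the prerequisite graph
    (best value achieved per leaf; 0 if the leaf was never seen)."""
    best = {lo: 0.0 for lo in leaf_lo_ids}
    for lo, p in history:
        if lo in best:
            best[lo] = max(best[lo], p)
    return sum(best.values()) / len(best)
```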

In our simulation experiment we created a learning object repository of 40 learning objects, connected in a randomly generated prerequisite graph. We created 65 simulated learners: 21 low aptitude learners (range 1-3), 26 medium aptitude learners (range 4-7), and 18 high aptitude learners (range 8-10). For a given experimental run, all 65 learners are assigned to use the same planning algorithm: one of the CFLS planner, the SPP, or the Random planner. During an experimental run, each of the 65 learners engages in a learning session that consists of a sequence of t = 200 interactions between the simulated learner and simulated LOs, where the next LO (or LO sequence) after an interaction is chosen for a learner by the designated planner for that experimental run. An experimental run simulates all 65 learners learning in parallel with each other, thus generating lots of data. After the experimental run is completed, coverage and expertise measures are computed for each learner, and averages over all learners are extracted. The same 65 learners can be used for each of our experimental runs (an advantage in experimental control that only simulation can provide). After the three experimental runs, each one using a different planner, we can compare the performance of the three planners as to their average coverage measure and their average expertise measure. This is possible because we can leave everything the same in each run apart from the planner (the same 65 learners each time, the same evaluation function, the same learning object repository), so any differences in average coverage or expertise are due to the planner used.

But before doing any comparisons, we first needed to tune the CFLS planner; that is, we had to explore different possible settings of b and f to see which values give the best results according to our success measures. To try to find the best settings, a total of 25 simulations were run, one for each combination of b and f with values ranging from 1 to 5. A second goal was to explore whether the CFLS planner works differently among our three classes of learners: low, medium, and high aptitude learners.

Then, using the best values for b and f, we could compare the CFLS planner to the baseline planners. We wished to show that the CFLS planner worked at least as well as the SPP and much better than the Random planner. If this turned out to be true, then the CFLS planner, which works without any knowledge of the inherent relationships among LOs, would have been shown to be as effective as a planner with this knowledge. The CFLS planner would, therefore, have in some sense inferred strictly from learner behaviour as much information as the SPP has been explicitly supplied with. This in turn would mean that the CFLS is a planner that could work in the open-ended learning environments envisioned by the ecological approach where there is no explicit annotation or mark up of learning objects, where information must be extracted from the actual activities of learners.

As with all collaborative filtering approaches, the CFLS planner relies on having usage data from other learners. Thus, the simulated learning environment needs to be in operation for some time before the CFLS planner is introduced, and then it can be “launched” using interaction data from previous learners. To allow EA usage data to accumulate, the SPP was used to select learning objects for t = 200 iterations. This data was saved as a synthetic dataset and used to initialize the case base before the CFLS planner was brought in. A new population of simulated learners (with identical characteristics to the learners who were informed by the SPP) was then created to use the CFLS planner.

There is still a cold start problem even after the simulation has been initialized. At the beginning of the run, the simulated learners who are to be guided by the CFLS planner have not yet viewed any LOs themselves, so there is no history to match the b LOs to create the plan. In this situation, the CFLS planner matches the learner with another random learner (from the interaction data generated by the SPP), and recommends whatever initial path that the randomly selected learner took when they first arrived.

In tweaking the value of b, we discovered that the ideal setting is dependent on the data available. In finding learners who are similar to a target learner (i.e. learners in the same neighbourhood), the CFLS planner matches their behaviour on the previous b LOs. As b gets larger, the number of potential learners in the neighbourhood diminishes. Further restricting the neighbourhood size is the ecological approach requirement that the outcome of the learner’s interaction with each LO be taken into consideration in making the match. The CFLS planner will only include a peer in the neighbourhood if they have achieved the same pass/fail outcome as the target learner on each of the b previous LOs that the target learner has interacted with (pass means P[learned] ≥ 0.6 and fail means P[learned] < 0.6). Finally, we must choose whether to use a “set match”, where all that is required is that the b LOs be the same with the same outcomes, or the more exacting “sequence match”, where the b LOs must not only be the same with the same outcomes but must also be in the same order. The number of potential learners in a neighbourhood, of course, gets smaller if we choose a sequence match. Given the density of our synthetic dataset, we chose to use a set match, since especially for larger values of b we would otherwise have very sparse neighbourhoods. The match is found by iterating through the peer’s history using an inner loop to check sequences of length b with similar outcomes, and if matched (using “set match”), the peer is added to the neighbourhood. It would not be difficult to switch to a sequence match, which might lead to better outcomes in situations where there is lots of data.
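
The difference between the two matching criteria, together with the pass/fail outcome condition, can be stated compactly as follows (an illustrative sketch; each window is a list of (LO id, P[learned]) pairs of length b, and MASTERY is the 0.6 threshold defined earlier).

```python
def outcomes(window):
    """Each LO in a window of b interactions paired with its pass/fail outcome."""
    return [(lo, p >= MASTERY) for lo, p in window]

def sequence_match(target_window, peer_window):
    """Same b LOs with the same outcomes, in the same order."""
    return outcomes(target_window) == outcomes(peer_window)

def set_match(target_window, peer_window):
    """Same b LOs with the same outcomes, in any order (the less restrictive
    variant, chosen here given the density of the synthetic dataset)."""
    return sorted(outcomes(target_window)) == sorted(outcomes(peer_window))
```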

Once a set of similar neighbours is found, the particular recommended path must be chosen. To determine whether a candidate path should be recommended to the target learner, the CFLS planner looks at the success each neighbour had as they consumed the next f learning objects. The success metric is taken to be the average P[learned] that the neighbour achieved over all f LOs along this path. The path recommended to the target learner is then the one with the highest average P[learned] among all of the neighbours. There is no minimum required value for this average; it merely needs to be the best forward path among all of the neighbours.

For our experiment we imposed the following additional constraints: b, once chosen, was the same throughout the 200 interactions in a session; and f, once chosen, was also the same throughout the 200 interactions. Finally, there is the issue of how long the target learner “sticks” with the recommended path, i.e. how many learning objects are consumed before replanning occurs. We call this parameter s. For this experiment, s was set to always equal f throughout a simulation run, meaning the target learner consumed all of the f LOs in the recommended path, one after the other in sequence, and could not opt out earlier; replanning occurred only after the full path had been consumed. For this paper we did not further explore the s parameter and the implications of varying it in various situations, so it does not factor into our experimental design or show up as an element of our results.

Results

As discussed above, in our simulation experiments we first explored various settings of the b and f parameters of the CFLS planner, seeking the values that led to the best outcomes in terms of coverage and expertise. Then, using these best settings for b and f, we compared the CFLS planner to the SPP and Random baseline planners, also in terms of coverage and expertise.

Again as discussed above, in each of the experimental runs described below, there were 65 simulated learners, 21 of whom were in the low aptitude group, 26 in the medium aptitude group, and 18 in the high aptitude group. There were 40 LOs, each with a difficulty level and possible prerequisite relationships with other LOs. Each learner was tracked over a session consisting of t = 200 interactions, after which average coverage and expertise measures were computed over each aptitude group.

What Combinations of b and f Worked Best?

To find the best settings for b and f, a total of 25 simulations were run, one for each combination of b and f ranging in value from 1 to 5. The resulting average performance measurements over all learners in each aptitude group (at t = 200) are summarized in the heatmaps in Figs. 1 and 2.

Fig. 1 Average Coverage Score (% Learning Objects Mastered) by aptitude group

Fig. 2 Average Expertise Score (avg. P[learned] on leaf nodes) by aptitude group

Figure 1 shows the average % LOs mastered, our measure of coverage, and Fig. 2 shows the average P[learned] score on the leaf node learning objects, our measure of expertise. In all six heatmaps, the cells coloured blue (largely in the upper right of the figures) indicate low performance on the respective measure. Cells coloured red (largely in the lower left of the figures) indicate high performance, and white is used for numbers in the middle. Pink means that the number is somewhere between the middle (white) and the very highest performance (dark red). Light blue means the number is somewhere between the middle and the very lowest performance (dark blue). The colour scale from blue to red is applied individually to each aptitude group so that the best and worst combinations of b and f for that group can be picked out easily.

All the heatmaps have a visible diagonal split between blue and red, with better results in the red half (where b ≤ f). One can think of them as red triangles of success and blue triangles of failure. Are these apparent differences actually real?

We used Student’s t-test to check whether the differences in adjacent cells were statistically significant. For this analysis, it was possible to use paired t-tests because the simulated learners have exactly the same characteristics in all of the simulation runs, the only difference being the order in which LOs were interacted with. A two-tailed t-test was used because it was not certain whether one distribution was going to be higher or lower than the other.
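
For example, a single cell-to-cell comparison can be computed with SciPy along the following lines (the score lists shown are placeholders to illustrate the call, not values from our runs).

```python
from scipy import stats

# Per-learner coverage scores for two adjacent cells (e.g. 3b3f and 3b4f),
# one entry per simulated learner, the same learners in both runs.
# Placeholder values only -- not the experimental data.
scores_3b3f = [62.5, 70.0, 57.5, 65.0, 60.0]
scores_3b4f = [72.5, 75.0, 70.0, 72.5, 67.5]

result = stats.ttest_rel(scores_3b3f, scores_3b4f)   # paired, two-tailed by default
print(result.pvalue)
```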

The t-test was conducted between each pair of adjacent cells in the 5 × 5 heatmap. Each heatmap has 40 possible cell-to-cell comparisons. The comparisons are done for each aptitude group, giving 120 cell-to-cell comparisons in all, each containing the t-test for both performance measurements. A sample is included below to illustrate our methodology (Table 1). Numbers in bold are statistically significant (i.e. low p-values). A value of n/a means that a t-test could not be conducted because the values are the same for both populations being compared in the two simulation runs. This happened for coverage when all learners in each run mastered 100% of the LOs.

Table 1 Student’s t-test p-values for b = 3, \(f = \{1, \ldots, 5\}\)

We now explain how to cross reference Table 1 with the heatmaps. Table 1 contains the t-test p-values that are a comparison of two cells of a heatmap. For example, the column headed 3b3f vs. 3b4f is a comparison of the cell in the heatmap (where b = 3, f = 3, or 3b3f) to the cell immediately below it (where b = 3, f = 4, or 3b4f). Any values that are bolded in Table 1 show that this group of learners had significantly different results between the two simulation runs. For example, for low aptitude learners, we can see from the appropriate upper left heatmap for 3b3f vs. 3b4f that the coverage was 62% and 72.1% respectively. From Table 1 we can see that the corresponding p-value is 0.0004, meaning that this is a statistically significant difference between the two simulation runs. For medium aptitude learners, the coverage values from the upper middle heatmap are 98.6% and 99.5%, corresponding to a statistically significant p-value of 0.0152. Looking now at expertise on the same two 3b3f and 3b4f cells in the lower heatmaps in (Fig. 2), the low aptitude learners in the appropriate heatmap scored on average 0.2256 for the 3b3f simulation run and 0.3016 for the 3b4f run, corresponding to a statistically significant p-value in Table 1 of 0.0231.

The results in the example above would help us to decide whether to use the CFLS planner with f = 4 rather than f = 3, when b = 3. For low aptitude learners, on both coverage and expertise measures it is statistically better to recommend sequences of length f = 4. For medium aptitude learners this is also true for coverage. But expertise is not statistically significantly different between the 3b3f and 3b4f runs (p-value is 0.1588). Nevertheless, there is a marginally better result (0.6874 vs. 0.6715) for f = 4 rather than f = 3, when b = 3, so in the absence of other factors the planner should recommend sequences of length f = 4 for medium aptitude learners. For high aptitude learners, on coverage there was no statistical comparison, since in each of the runs the learners mastered all of the LOs over the 200 interactions they engaged in. However, on the expertise measure, high aptitude learners achieved 0.7727 in 3b3f and 0.7633 in 3b4f (lower right heatmap), although this is not significantly different (p = 0.1026). For high aptitude learners, then, it seems to actually be better to use f = 3 rather than f = 4, although not significantly so.

Overall, there were 50 cell-to-cell comparisons that showed statistically significant differences according to the t-tests, and 24 of these were along the red/blue border (almost half of all cases). In 23 of these cases, the cell-to-cell comparisons were statistically different according to both coverage and expertise. In one case only, coverage showed significant differences but expertise did not. This means that the diagonal pattern observed in the heatmap colour visualization is in fact statistically real.

One could interpret this pattern as a rule of thumb: a planner recommending sequences of length f = k should create a neighbourhood based on a maximum of k LOs in common. Another interpretation is that if a learner has been matched with a neighbourhood containing sequences of k LOs in common, then learners should follow the path ahead for at least k LOs before re-planning occurs. Abandoning the path too soon would lead to less effective learning. Note that it appears the diagonal pattern generalizes beyond b = 5, f = 5. We ran b = 6, f = 5 and found that indeed there was a drastic drop in performance. Another row was also run using a fixed f = 6 and varying b. Again, a drop in learner performance was found at b = 7. So, the pattern appears to continue for greater values of b and f.

Outside the red/blue border, it was rare for the differences between neighbouring cells to be statistically significant, but when they were, it was most often for low aptitude learners on the successful side of the diagonal split (the red triangle). Of these, coverage was statistically significant more often than expertise. This was a bit surprising because coverage deals with all of the LOs in the system, whereas expertise deals with only a small number of LOs, so small changes (e.g. an unusual P[learned] on one of them) would seem to have more potential impact on that measure.

Another general observation is that different aptitude groups found their best performance with differing values of b and f: the higher the learner’s aptitude, the higher the values of b and f that should be used. It’s harder to see this in Fig. 1 because coverage was the same for all aptitude groups much of the time. Instead, we can look at expertise in Fig. 2, where the dark red cells indicate the best combinations of b and f. For the low aptitude group, the darkest red appears in the top left corner (b = 1, f = 1). This means that low aptitude learners attained the highest expertise when the CFLS planner used b = 1, f = 1. For medium aptitude learners, the dark red cells occur at slightly higher values of b and f (b = 1 or 2, f = 2). The high aptitude learners performed best with higher values still, with the highest being b = 3, f = 3.

This pattern makes sense. The lower the aptitude of the learners, the less likely it is that they are following coherent paths to learning, and thus the less useful it is to form neighbourhoods based on these paths and the less useful it is for them to stick to a longer path going forward. The opposite is true for high aptitude learners who are generally creating coherent longer learning paths that other high aptitude learners are able to follow successfully. Of course, there is an upper limit as to how large b can be. At some point there are diminishing returns as b gets higher and higher, as there will be fewer and fewer learners who have followed a similar long path from which to form a neighbourhood.

Comparison to the Random Planner and the Simple Prerequisite Planner Baselines

To give a basis of comparison for the CFLS planner, two baselines were used: a Random planner and a Simple Prerequisite Planner (SPP). Table 2 shows coverage and expertise for each aptitude group for these baseline planners. In general, and as expected, the low aptitude learners scored lower than the medium aptitude learners, who in turn scored lower than the high aptitude learners. This is clearly indicated by expertise for both the Random planner and the SPP, where the higher the aptitude of the learners the better the expertise. For coverage, all learners using the SPP had the same score, mastering 100% of the LOs. Using the Random planner, high aptitude learners mastered a higher percentage of LOs than the other learners. This suggests they were better able to cope with receiving LOs in random order. The low aptitude learners mastered a slightly higher percentage (27.3%) of LOs than did the medium aptitude learners (26%). This was unexpected, but it is perhaps nothing more than an artifact: the LOs the Random planner happened to serve up to the low aptitude learners may have been easier than those it gave to the medium aptitude learners.

Table 2 Baseline values

At a glance, it is clear from Table 2 that on both expertise and coverage measures, the SPP is a vastly superior planner to Random, not surprisingly given the importance of mastering a learning object’s prerequisites to the chances of a learner mastering the LO. So, we expect the CFLS planner to easily outperform Random, but to have a more difficult time when challenging the SPP. If, however, the CFLS planner can perform as well as or better than the SPP, then it is an impressive feat, given that the CFLS planner has no access to the important prerequisite information that has so benefitted the SPP.

Looking in more detail, we compare the results of the baseline planners in Table 2 to the results of the CFLS planner on various settings of b and f in the heatmaps in Figs. 1 and 2. Compare, first, the CFLS planner to the Random planner. For all but one setting of b and f, the CFLS planner readily outperforms Random on both coverage and expertise measures. The one exception is for b = 2, f = 1 for low aptitude learners, where the CFLS planner is marginally worse than the Random planner on both measures. As discussed above, this suggests that the sequence of LOs consumed by low aptitude learners doesn’t provide much guidance for the CFLS planner’s decision making. In all other cells the CFLS planner is better than the Random planner, vastly better in the cells in the red triangles where the b and f settings are more appropriately chosen.

Next we compare the CFLS planner to the SPP. Here, the comparisons between the two planners are more mixed, with various combinations of b and f yielding different results. For low aptitude learners, the CFLS planner only outperforms the SPP (on both coverage and expertise) in one case: b = 1 and f = 1. For all other combinations of b and f, the low aptitude learners using the CFLS planner did not do as well as learners who used the simple prerequisite planner. However, for both medium and high aptitude groups, the CFLS planner outperformed the SPP on expertise in all cells within the success triangle, and on coverage in all cells within the success triangle except b = 2, f = 3; b = 3, f = 3; b = 2, f = 4; b = 3, f = 4; and b = 4, f = 4, where the coverage of the CFLS planner is just marginally less than the SPP’s 100%. This suggests that when the LO sequences followed by learners have more often led to successful learning, as for medium and high aptitude learners, they can often provide more reliable information to the CFLS planner about what paths to recommend than even the direct knowledge of the LO prerequisites used by the SPP.

The important lesson, though, that can be drawn from the comparison between the CFLS planner and the SPP is that there are settings of b and f in which the CFLS planner works better than the SPP. On the raw numbers from this analysis, the recommendation would be for the CFLS planner to use b = 1, f = 1 for low aptitude learners, b = 1, f = 2 for medium aptitude learners, and b = 3, f = 3 for high aptitude learners. A simpler rule that would be nearly as good would be to set b = f, with b = 1 for low aptitude learners, b = 2 for medium aptitude learners, and b = 3 for high aptitude learners. This is an interesting rule in that it suggests a pattern where the length of the sequence used for planning is a good guide as to how long the learner should follow the plan once it is made; and that longer sequences are increasingly useful for planning the “better” the learner is.

So the CFLS planner, based on its knowledge of the learner (in our case the learner’s aptitude) and the learner’s behaviour (the most recent LOs consumed), can actually perform better than standard prerequisite planning. And, most impressively, it can do this without knowing anything about the underlying connections among the learning objects (in our case the LO prerequisite relationships), but just by following traces of learner behaviour. All planning by a CFLS planner exclusively uses information about learners and their interactions with the learning objects. No externally imposed metadata about the learning objects is needed, and even the learner model information could be extracted from patterns mined from learner behaviour rather than added from external sources such as surveys. In fact, in complex real world learning situations, a CFLS planner could implicitly extract all sorts of underlying patterns about learners and learner interactions with learning objects that are much more subtle than the aptitude levels and prerequisite relationships in our simple simulation model. This could lead to successful planning of learning sequences even in such complex learning environments without much externally imposed knowledge engineering being required. This is promising for building planners that really do work well in open-ended, unstructured learning environments.

Literature Review

In this section we discuss literature that is relevant to our research, and in particular to the techniques we have used. Our approach differs from other approaches to pedagogical planning in learning environments in two main ways: it uses learners’ sequences of previous actions, and it generates sequences intended to be consumed as a path. Other research in this area often overlaps with one of these aspects, but it’s rare to find work that is concerned with both. A third area of relevance to our approach is the broader literature about using learner traces (not necessarily sequences) as a basis for a recommendation (not necessarily a sequence).

Learner sequences of previous actions have been used in the literature for many purposes, including curriculum analysis, metadata generation, and learner modelling.

In curriculum analysis, user sequences are a series of course enrolments taken by learners that are studied to find out if actual student registration behaviour corresponds to the expectations of program designers (Bendatu and Yahya 2015) or to help students make better registration decisions. Another study compares three approaches to course recommendation based on user sequences: process mining, sequential pattern mining, and finally a dependency graph that was found to contribute the most to student performance (Wang and Zaïane 2018). Course enrolment pattern analyses also can be done with a probabilistic approach (Gruver et al. 2019) or with collaborative filtering (Polyzou et al. 2019). The main difference from our focus is that curriculum analysis is usually for a formal program and not an open-ended, unstructured environment.

Learner sequences have also been used to solve the problem of missing LO metadata by using previous interaction data of learners to automatically generate the metadata. In one application, the authors show how to mark up an electronic textbook by taking user sequences and using machine learning to predict the outcome and prerequisite concepts (Labutov et al. 2017). In another application, the authors show how to track student learning, self-efficacy, and motivation using the iLOG system, which generates the metadata using association rule mining and the EA architecture (Miller et al. 2011).

The EA approach to metadata storage has points of similarity with the concept of stigmergy, an asynchronous form of communication in which agents coordinate by leaving messages for one another in the environment. Stigmergy allows advanced coordinated behaviour to emerge in dynamic and uncertain environments (Ricci et al. 2007).

Learner modelling from user sequences can be done to understand the different types of learners based on their learning interaction patterns. One technique uses clustering and association rules in exploratory environments (Bernardini and Conati 2010). Different sequence patterns may signify different learner attributes. For example, it has been found that higher performers have different sequence patterns than lower performers (Gitinabard et al. 2019). Sequence mining can be used to track and identify learners’ cognitive skills and learning behaviours in open exploratory environments (Kinnebrew et al. 2014) or to detect productive inquiry (Perez et al. 2017). Beyond content sequences, another way to look at learners is to study their affective trajectories (Padrón-Rivera and Rebolledo-Mendez 2015). Research has gone into collecting and displaying learner traces in a dashboard to help instructors understand their students (Santos et al. 2015). The sequences may also give a clue as to users’ interests in a recommender system, where different orderings may indicate different interests (Yu et al. 2006).

We also use sequences for learner modelling (in addition to attributes such as aptitude). The CFLS planner takes into account that an identical path could be successful for one learner but a failure for another. The CFLS planner doesn’t try to model the reasons that a given sequence was successful or not. It could have been due to a learner’s more distant past experiences, their learning preferences, various affective issues, or anything else. Because of the EA architecture, we don’t need to know the reasons for the success or failure, only the result and the path that led to that result.

In the literature about generating sequences intended to be consumed as a path, the work often falls in the areas of curriculum analysis, problem sequencing, narrative and story-based systems, or tourism and travel planning. In the introduction to this paper we already overviewed the tourism and travel planning area in our discussion of recommender systems that suggest sequences of destinations for potential travellers, so we will only go through the other three areas here.

In curriculum analysis, the knowledge about a formal program can provide more structure for generating sequences than in an open-ended, unstructured environment. For example, using association rule mining, the correlations between enrolments of courses taken by students both before and after a given course can generate a recommendation of a path ahead (Chang et al. 2016). One of the visualizations developed by the authors shows enrolment paths of a sequence of courses taken. The path of courses taken after a given course can be used as a recommendation to other students after taking that course (similar to our recommendation of learning objects, but at a much coarser granularity than learning objects).

The area of problem sequencing is related to the areas of adaptive testing and item response theory where, like our approach, the items are chosen from a repository of items. A standards-based approach can generate a customized sequence of lessons using LO metadata and a topic relationship graph (Farrell et al. 2004). Temporal collaborative filtering can be used to predict whether learners will correctly solve a problem, thus helping the system to choose an effective problem to recommend next (Cetintas et al. 2010). There is a tradeoff between gaining an understanding of student learning of items and the pedagogical desirability of presenting easier items before more difficult ones (Čechák and Pelánek 2019). Finally, an area called “Knowledge Spaces” (exemplified by Falmagne and Doignon 2011) uses a structure that is empirically obtained and serves to recommend different sequences to different learners. Our approach differs in that sequences come from a local calculation (using collaborative filtering) rather than a global structure, and can thus adapt as more learners interact with the LOs, as new LOs are added, and as old LOs are accessed less frequently; what may be a favourable sequence for one learner is not necessarily favourable for another.

In story and narrative-based systems, the plot points need to be planned to take into account the user’s past interactions. Systems can use collaborative filtering to plan a sequence of plot points based on a story library and previous user feedback (Yu and Riedl 2012). Yu and Riedl used a non-Markov Decision process, where the next step depends on all previous steps. Similarly, in Crystal Island, a scientific discovery world, the system can recommend a sequence of embedded assessments by using model-based collaborative filtering (Min et al. 2013). In this system, learners fill in concept maps that are dynamically linked to objects in the world. Another system, the Cordillera natural language tutor, has been used to study how the system’s choice of micro-steps, chosen with reinforcement learning, could improve student performance by strengthening their qualitative understanding of physics concepts (Chi et al. 2011).

In the broader literature about using learner traces as a basis for a recommendation, the recommendation is usually a single item or Top-N list, rather than a sequence where the intention is that the learner follow all of the steps along a path. Some key issues in the literature that impact the generation of sequences are: the several meanings of the word “open”, the amount of control the learner will have over the sequence, and how the sequence will be shaped by other learners.

There are many meanings of the word “open”. Our sense of the term is that the learning environment is “open-ended”: that is, it is dynamic and open to new learning objects and new learners. A related sense of the word is the system’s ability to work with this dynamic repository without needing extensive knowledge engineering first. Another sense of the term is that of “opening” the learning beyond a single student with a digital tutor toward a focus on the existing learning environment such as a group in a classroom (Hoppe 2016). Hoppe also describes the sense of open learner models, where learners can reflect on and advance their learning by interacting with the system’s model of their own learning (Kay and Bull 2015).

When a sequence is generated for the learner, another sense of “open” is whether the learner has freedom to deviate from that sequence. In our approach, such flexibility is reflected in the choice of the f parameter, which can be varied, but once chosen cannot change until all f LOs have been consumed by the learner. Our model also allows for an s parameter (how long the learner sticks to a plan) that can be changed, but in this paper we have not explored how this might be usefully varied from learner to learner or even from situation to situation. This meaning of “open” is related to the constructivist approach to education and open-ended learning environments (OELEs) (Hannafin 1994; Land 2000). It may be best to limit the control given to beginners to help them avoid getting overwhelmed or caught up in a filter bubble (Nagulendra and Vassileva 2014). Learner control can be shared between the learner and the system to help achieve a balance (Corbalan et al. 2006). It has been suggested that learners might benefit from some control once they have acquired some background knowledge (Clarebout and Elen 2008). This ability to adapt relates to the expertise reversal effect (Kalyuga et al. 2003): the same teaching approach can work very well for advanced learners but poorly for novices.

The generated sequence may also be influenced by others. Digital traces, when combined with an ontology, have been used to provide learners with a “semantic nudge” that recommends concepts based on the previous interactions of other learners (Al-Tawil et al. 2014). The research area of social navigation makes use of interaction data to highlight the paths of other users, paths that can be seen as recommendations. Social navigation encourages learners down a fruitful “road less travelled” and is particularly useful where content is not well structured, as in Open Corpus systems (Brusilovsky et al. 2004). Navigation paths can emerge from usage rather than being designed ahead of time. Research has looked at user trust in following the paths of others, the influence of the navigation support, and the likelihood that learners will follow a path (Farzan and Brusilovsky 2019). Social navigation can also be augmented with knowledge-based guidance to recommend a sequence (Hosseini et al. 2015). Compared to the types of social navigation identified in the literature (Forsberg et al. 1998), our approach may be considered indirect (as opposed to direct solicitation from peers), and perhaps a cross between intended and unintended: the CFLS planner intentionally gives the recommendation, but the previous learners did not make their choices with the idea of influencing others. By comparing a learner’s current behaviour with that of past learners at the same point, a system like iList is able to give reactive feedback (Fossati et al. 2009).

Research close to ours is Champaign’s approach to planning content in courses (Champaign 2012). In Champaign’s approach, LOs are also recommended within an EA architecture, and simulation is also used to explore the characteristics of the approach. Champaign’s approach does not use or recommend sequences per se, but rather is based on a reactive loop in which the next LO recommendation is computed at each step, taking into account the learner’s most recent experience. To select the next LO, a greedy algorithm computes the anticipated benefit of each candidate LO and recommends the one with the highest anticipated benefit. The anticipated benefit is computed from the experience of other learners who interacted with the LO, taking into account their learning gains (the difference in pre- and post-assessments of the LO interaction) and their similarity to the target learner, where similarity is based on a normalized comparison of their pre-assessments on the LO.
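To illustrate this kind of reactive selection, the following sketch shows a greedy, similarity-weighted benefit calculation in the spirit of Champaign’s description. It is our own simplified reconstruction, not Champaign’s implementation: it assumes pre- and post-assessments are normalized to [0, 1], and the similarity measure and all names are illustrative choices.

```python
# A simplified, hypothetical reconstruction of greedy LO selection in the
# spirit of Champaign (2012). Assumes pre/post assessments lie in [0, 1];
# the similarity measure and all names are our own illustrative choices.
def anticipated_benefit(target_pre, interactions):
    """interactions: list of (other_pre, other_post) scores on one LO."""
    weighted_gain, total_weight = 0.0, 0.0
    for other_pre, other_post in interactions:
        gain = other_post - other_pre                    # that learner's learning gain
        similarity = 1.0 - abs(target_pre - other_pre)   # closeness of pre-assessments
        weighted_gain += similarity * gain
        total_weight += similarity
    return weighted_gain / total_weight if total_weight else 0.0

def recommend_next_lo(target_pre, candidate_los):
    """candidate_los: dict mapping lo_id -> list of (pre, post) pairs."""
    return max(candidate_los,
               key=lambda lo: anticipated_benefit(target_pre, candidate_los[lo]))
```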

Thus, our approach overlaps with other research along many dimensions: using learners’ sequences of previous actions, generating sequences, and the broader use of learner traces to make recommendations. The main distinctions of our approach are its use of both backward and forward sequences and its ability to work in open-ended, unstructured environments by using behavioural data that accumulates naturally over time.

Contributions

Open-ended, unstructured learning environments are now commonplace, with all sorts of learners pursuing their own individual learning goals via active browsing of web sources or by searching through other repositories of information. A particularly important type of learning that can be supported by such open-ended learning environments is lifelong learning, undertaken by professionals upgrading their skills or by “just plain folks” learning about new things throughout their lives.

We have created a planner that can work in such environments: the CFLS (Collaborative Filtering based on Learning Sequences) planner. As discussed in the paper, the CFLS planner can find for a target learner a sequence of f learning objects that can help that learner understand some topic. It does this by using a collaborative filtering approach: it looks at the sequence of the previous b learning objects consumed by the target learner, and finds a neighbourhood of other learners who have consumed a similar sequence of b learning objects in the past. It then finds among the neighbourhood learners the sequence of f learning objects that worked most effectively going forward, and recommends this sequence to the target learner.
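To make these steps concrete, here is a minimal sketch of the planning step in Python. It assumes each learner’s trace is stored as a chronological list of (learning object, evaluation score) pairs, and reads “worked most effectively” as the highest mean score over the f-step forward window; the function and variable names are ours, not part of any released implementation.

```python
# A minimal sketch of CFLS planning, under the assumptions stated above.
def cfls_plan(target, histories, b, f):
    """histories: dict learner_id -> list of (lo_id, score), in time order."""
    target_history = histories[target]
    if len(target_history) < b:
        return None                                    # not enough behavioural data yet
    recent = {lo for lo, _ in target_history[-b:]}     # the target's last b LOs

    best_plan, best_score = None, float("-inf")
    for learner, history in histories.items():
        if learner == target:
            continue
        los = [lo for lo, _ in history]
        # Neighbourhood test: this learner consumed the same b LOs at some point
        # (order ignored here; exact-order matching is discussed under Future Directions).
        for i in range(len(los) - b - f + 1):
            if set(los[i:i + b]) == recent:
                forward = history[i + b:i + b + f]     # the f LOs they consumed next
                score = sum(s for _, s in forward) / f
                if score > best_score:
                    best_plan = [lo for lo, _ in forward]
                    best_score = score
    return best_plan                                   # the recommended f-LO sequence
```

For example, a call like cfls_plan("learner42", histories, b=2, f=3) would return a three-LO plan, or None if no other learner has yet consumed the target’s last two LOs.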

The CFLS planner satisfies the requirements of an open-ended, unstructured learning environment. It creates plans based on traces kept of learner interactions with learning objects and (if available) learner model information. There is no need for externally engineered metadata to be attached to the learning objects. New learning objects can thus easily be incorporated into the learning environment, and so can new learners. As the new learning objects get used and the new learners start interacting with both new and existing learning objects, the CFLS planner can make increasingly good decisions about what sequences to recommend to these new learners based on their past behaviour, and can make increasingly well informed recommendations about new learning objects as behavioural data accumulates around them. This ability to handle a dynamically changing world is essential to any planner that is to work in open-ended, unstructured learning environments.

The CFLS planner works efficiently. The most computationally expensive part of the CFLS planner is finding the learners in the neighbourhood, and this is at worst linear in the number of learners and linear in the amount of learning object interaction history created by each learner. The way the CFLS planner creates a neighbourhood differs from a typical recommender algorithm, where each learner is matched against every other learner or against every learning object. The CFLS planner first narrows down the neighbourhood by searching each learner’s history of LO interactions for a match to the target learner’s last b learning objects. The forward search over the next f learning objects is then executed using only the small resulting neighbourhood.
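The two phases can be separated to show where the cost lies: phase 1 is a single linear scan over every stored history, and phase 2 touches only the small neighbourhood that survives the scan. The sketch below is equivalent in spirit to the earlier cfls_plan sketch (it keeps only the first match per learner, for brevity), and all names are again illustrative.

```python
# Phase 1: one linear scan over all histories for the target's last b LOs
# (recent is the set of those b LO ids). Cost: O(learners x history length).
def find_neighbourhood(recent, histories, target, b):
    hood = []
    for learner, history in histories.items():
        if learner == target:
            continue
        los = [lo for lo, _ in history]
        for i in range(len(los) - b + 1):
            if set(los[i:i + b]) == recent:
                hood.append((learner, i + b))      # remember where the match ends
                break                              # first match per learner is enough here
    return hood

# Phase 2: rank forward f-LO windows, looking only within the small neighbourhood.
def best_forward_window(hood, histories, f):
    best_plan, best_score = None, float("-inf")
    for learner, start in hood:
        window = histories[learner][start:start + f]
        if len(window) == f:
            score = sum(s for _, s in window) / f
            if score > best_score:
                best_plan, best_score = [lo for lo, _ in window], score
    return best_plan
```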

Through simulation studies, using simple simulated learners and minimal learning objects embedded in a specific open-ended learning architecture called the ecological approach (EA), we demonstrated that the CFLS planner can be very effective. The CFLS planner in fact proved to be massively better (on two different measures of learning outcomes) than an approach where learners chose the next learning object randomly. With appropriate tunings of b and f the CFLS planner was even considerably better than a “traditional” instructional planner that created plans based on its explicit knowledge of the prerequisite relationships among the learning objects. It is especially encouraging that the CFLS planner was able to do so well even though it did not have any knowledge itself of the prerequisite relationships among the learning objects, but was extracting its recommendations based purely on information about the learners and their interactions with the learning objects.

Our simulation studies of the CFLS planner also uncovered interesting pedagogical patterns. In particular, it turns out that there is a relationship between b and f that provides the best learning outcomes: in general, outcomes are better when b is no greater than f. The CFLS planner works best for low aptitude learners when b = f = 1, for medium aptitude learners when b = 1, f = 2, and for high aptitude learners when b = f = 3. With further simulations it may turn out that there is a general rule that the best outcome occurs when b = f, since the learning outcomes for medium aptitude learners with b = f = 2 are only marginally lower than those for b = 1, f = 2. There may well be many other interesting patterns lurking in the data that further studies would reveal. With more complex learners and learning objects, there may be niche groups of learners for which the CFLS planner can find plans based on very different aspects of the learner model and the behavioural traces, yielding very different values of b and f for learners sharing this niche.

The big issue with any collaborative filtering approach is the cold start problem. This problem occurs when the learning environment is first released to learners with an initial set of learning objects and there is no behavioural data at all. In this context, we recommend doing what we did in our study: use simulation to generate a set of artificial learners who then undertake simulated interactions with the learning objects. New learners can then be matched to the artificial ones; gradually, data about real learners builds up and the artificial learners can fade away. There is an ongoing cold start problem, though, as new learners or new learning objects join the learning environment over time. One way to handle this is for the CFLS planner to simply make random recommendations (to new learners or of new learning objects) until real data starts to accumulate. This may be the only way to incorporate a new learning object, but for a new learner there may be a better way. If the new learner has some sort of profile (i.e., a learner model) provided, perhaps from a survey taken before joining the learning environment, the CFLS planner can make matches based on the learner model information, even with no behavioural data. As behavioural data accumulates, the matching can shift to a more normal balance of learner model and behavioural information. It should be further noted that there need be no “cold stop” problem: that is, there is no reason to delete learners who have left the system or learning objects that are used less and less. The data left behind can still be used in future matching by the CFLS planner, even if the learner is no longer active and the learning object seldom used.
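The sketch below shows one way these fallbacks could be layered, reusing cfls_plan from the earlier sketch. The profile similarity measure, the ordering of fallbacks, and the choice to suggest a profile-similar learner’s first f LOs are all illustrative assumptions rather than a tested design.

```python
import random

def profile_similarity(p, q):
    """Toy similarity over shared numeric profile attributes (illustrative only)."""
    shared = set(p) & set(q)
    return -sum(abs(p[k] - q[k]) for k in shared) if shared else float("-inf")

def recommend_with_fallbacks(target, histories, profiles, all_los, b, f):
    # 1. Normal CFLS planning, once the target has enough behavioural data.
    plan = cfls_plan(target, histories, b, f)
    if plan:
        return plan
    # 2. New learner: match on learner model information alone (e.g., from an
    #    intake survey) and, for illustration, suggest the first f LOs that the
    #    most similar learner consumed.
    if profiles.get(target):
        candidates = [l for l in histories if l != target and l in profiles]
        if candidates:
            neighbour = max(candidates,
                            key=lambda l: profile_similarity(profiles[target], profiles[l]))
            starter = [lo for lo, _ in histories[neighbour][:f]]
            if len(starter) == f:
                return starter
    # 3. Last resort (also the route for brand-new LOs): a random plan, so that
    #    behavioural data starts to accumulate.
    return random.sample(list(all_los), min(f, len(all_los)))
```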

Future Directions

There are many future directions in which to take this research. In the short term, we could explore larger values of b and f, although this is unlikely to yield much, since the neighbourhoods for larger b values will become smaller and smaller, and larger f values will likely produce lengthy plans that are too rigid. To overcome such rigidity we could see what happens when we allow learners to opt out of such lengthy plans, by reintroducing the “stickiness” factor s (how long the learner must stick with the plan) and exploring the effects of varying s so that it is no longer constrained to be the same as f. What happens when all three variables interact? What patterns emerge from these interactions? Is f or s the more important parameter even in our current results?

Another interesting direction is to take the aptitude level into account in finding matching learners, for example only matching learners of a particular aptitude level to other learners at the same level when forming the neighbourhood and then looking at the previous b objects to refine the neighbourhood. This would require many more simulated learners in order to get an appropriate density of matches in forming the neighbourhoods. It would also be interesting to match learners of different aptitudes when forming the neighbourhoods. For example, if the CFLS planner were to match low aptitude learners only to high aptitude learners who have consumed the same b objects, does this result in better (or worse) paths being recommended to the low aptitude learners? What are the effects of these kinds of manipulations on the relationships between b, f, and s?

It would also be interesting to explore other manipulations of the matching algorithm when a neighbourhood is being formed. One natural next step would be to explore a matching algorithm in which the b learning objects must not only be the same LOs but also be in the exact same order. This would require the generation of many more simulated learners so that neighbourhood density can be maintained. Another possibility would be to incorporate new factors into the matching algorithm. This would likely only become a really useful direction, though, with a more sophisticated simulation model, with more information kept in each learner model, and (perhaps) learning objects with more attributes. In this context, many interesting patterns may emerge, with trade-offs observed between how useful the learner model information is versus the interaction data. It could turn out that each individual learner has different values of b, f, and s that lead to the best learning outcomes. Further, these values may actually change over time as the learner spends more time in the learning environment and learns more and provides more data that the CFLS planner can use.

At some point it will be important for the CFLS planner to be trialled with data from real learners. Initially, this data may inform more realistic simulations that would shed light on what might work in the real world (and might not). However, eventually a CFLS planner tuned as well as possible through these simulation studies will have to be put to the test in a real world environment, recommending learning paths to real learners. It will then become clear whether the CFLS planner can detect patterns that really do affect learning even amongst the noise of a real world learning situation.

Extending the CFLS planner into such a real world situation would raise two complex issues. The first is the “learner modelling” issue: it would be necessary in the real world to determine what information to gather about learners, how to gather it, and how to maintain it over time. The second is the “content” issue, i.e., the need to deal with learning objects that have real educational content. The CFLS planner will not need to know anything explicitly about this content, since the planner reasons only with learner data, i.e., learner behavioural data attached to learning objects and information gleaned from the learner models. So, in one sense the fact that learning objects actually have real content is not an issue in and of itself. But, as with the learner modelling issue, decisions would have to be made about what learner behavioural data to capture, at what granularity (keystroke level or some higher level of interaction), and how to infer important things from this learner data, such as determining how successfully a learner has interacted with a given learning object (the role of the evaluation function in our simulation model). As suggested by the ecological approach, finding patterns in the learner data through educational data mining (EDM) would be a key element in resolving both the learner modelling and content issues. But which EDM algorithms to use, what patterns might emerge, and whether all of this can be done without any kind of explicit human-engineered metadata cannot be known until the real world experiment is designed, implemented, carried out, and evaluated. It seems likely that the answers to these questions will depend on the particular characteristics of the real world educational environment being investigated, although some general lessons will no doubt emerge.

Conclusion

In conclusion, the CFLS planner is an innovative approach to supporting learning in open-ended, unstructured and dynamic learning environments. It can handle the “forcing functions” of this kind of learning environment, most importantly the requirement in such contexts to minimize the need for explicit external knowledge engineering of metadata.

This research also shows the usefulness of even low-fidelity simulations in shedding light on AIED issues. For example, in companion research we have taken the same simulation model used to study the CFLS planner and used it to study the impact of peers on learning (Frost and McCalla 2013). We expect future AIED research to use simulation much more frequently to explore important issues.

The CFLS planner is one of the few planners or recommender systems to use sequences to recommend sequences. The various pedagogical patterns that emerged through our experimentation are intriguing and show the promise of the CFLS planner. These patterns, though, also demonstrate the need for the CFLS planner to personalize its recommendations: to take aptitude levels into account in forming neighbourhoods and producing plans, and, in more complex situations, to take into account many other aspects of the learner and the learner’s behaviour, since the most effective learning path may differ from individual to individual. The strength of the CFLS planner is that it holds promise to discover such individualized learning paths by observing learner behaviour, without the need for external knowledge engineering.

Jim Greer’s Influence

This paper emanates from the master’s thesis (Frost 2017) of the first author. As a member of my (Stephanie’s) committee, Jim encouraged the development of this approach with his questions and feedback. Noting that the CFLS planner should theoretically get better over time, he asked: would it asymptotically approach the performance of the SPP? This paper shows that to be a good insight, since the CFLS planner has not only approached but exceeded the SPP, at least in our simulation study. Jim also asked: what would happen if we deployed this planner in a real MOOC while mixing simulated learners in with real ones? I hope to keep exploring this and other questions!

Jim’s influence went a lot further than my thesis. For eleven years, most of my full time work went toward initiatives led by Jim at the University of Saskatchewan. He was always asking his team what we thought and always checking to make sure that logistics and concerns were addressed so that everyone could see the benefits of the latest innovation. Especially in learning analytics projects, we’d often build something with a big wow factor. And then came the exploration: Jim would teach us never to blindly trust the first result. Because of his questions, we’d double check, cross reference, run follow up queries, and explore curiosities that came up with faculty members and colleagues using the technology.

Jim was always encouraging me to connect my research with my day job, and I’m starting to understand that it’s because Jim himself always worked grounded in research. As a quiet and reserved person, I understand how Jim’s way of stirring things up could make some want to step back and watch first! But Jim would help you find your balance and then you’d be further than you’d ever be on your own. Thank you, Jim. Losing you was to lose a primary guiding beacon. But, as Gord says, you’ve left us with so much.

The second author, of course, also owes a huge debt of gratitude to Jim. Jim arrived at the University of Saskatchewan in 1987 as my (Gord’s) postdoctoral student, but it wasn’t long before he was hired into a faculty position in the Department of Computer Science. We knew a gem when we saw one! Jim and I worked closely and deeply together for 30+ years on AIED and UMAP research, constantly sparking ideas off each other. Our research collaboration led to the formation of the ARIES Laboratory, which over the years attracted top quality graduate and undergraduate students, postdoctoral scholars and research associates, new faculty members (including the guest co-editor of this special issue of IJAIED, Julita Vassileva), and visits from many internationally renowned AIED and UMAP researchers. ARIES was an intellectual hotbed, fizzing with interesting ideas. For example, the ecological approach environment that is so critical to the research in this paper was a natural outgrowth of highly innovative work in ARIES on peer help systems driven by Jim, Julita, and me (see Vassileva et al. 2016). Jim and I were equally close away from research, exploring interesting new ideas for our Department and University, in teaching, curriculum, research, and outreach. Our frequent shared coffee and lunch breaks were crucial times for thinking laterally. Many of our best ideas arose during discussions over a coffee or a sandwich or, when at a conference or workshop, over a beer or two. Of course, I don’t mean to imply that Jim only worked with me, and I with him. In fact, as Julita and I discuss in the introduction to this special issue of IJAIED, Jim had a fabulous career with diverse collaborations and many groundbreaking contributions. And I haven’t even mentioned his pervasive sense of humour and the elaborate practical jokes he was constantly perpetrating on the unsuspecting. But that’s for another forum. Let me just conclude that without Jim my life would have been much less rich, both intellectually and emotionally. Thanks, Jim, for your friendship and all you’ve done over the years for me and for all those who knew you.