
1 Introduction

This paper presents a way to design a learning support system that helps a learner acquire a creative learning skill. The objective of this research is to design the learning goal space for a creative learner. There are two research goals. One is to establish a method for designing the creative learning task of the learning support system. The other is to clarify the human sense of creativity. As background, progress in information technology is expected to lead to about half of human jobs being replaced by computers in the near future. The remaining jobs, which are technically difficult for both AI and computers, require high creativity or social skills.

This paper describes creativity for both the human and the system. Previous research on human creativity suggests that “creativity is the ability to come up with ideas or artifacts that are novel and valuable” [1]. In addition, “one process of creating ideas involves making unfamiliar combinations of familiar ideas, requiring a rich store of knowledge” [1]. However, it is very difficult for a computer to acquire human creativity, because it is not clear how to combine familiar ideas in unfamiliar ways.

To solve this problem, we focus on making use of human creativity, which is higher than that of computers. We propose a mechanism, based on the framework of our continuous learning support system, that lets a human learner see the creativity in his/her own learning. In the proposed mechanism, the support system generates a derived learning achievement by combining the original achievement with the solution found by the learner. The learner can then reflect on his/her own learning trace in the learning goal space. Through these, the support system makes the learner aware of both unclear learning results and his/her own sense of values.

We propose three support methods. The first is the visualization of learning traces to support awareness of creativity in learning. We design the learning goal space to visualize learning traces, which are the distribution of the learning goals found by the learner while learning the original achievement and the derived ones generated by the support system. Showing the position of each learning goal relative to the learning trace makes it easier for the learner to reflect on the orientation of his/her learning.

The second is discovery support for unknown solutions: the system generates a derived achievement by negating the shortest solution the learner has found. This encourages the learner to perceive solutions that are still unclear to him/her. The third is generating a derived achievement by justifying a redundant solution the learner has found. This encourages the learner to perceive his/her unclear sense of values.

2 Background

This section describes the theoretical background of this research. After reviewing research on creativity, we give an overview of continuous learning, which is the basic framework of our creative learning process, and then describe the creative learning skill.

2.1 The Research on the Creativity

We consider creativity on the basis of J.P. Guilford’s approach. Guilford states that the primary characteristics of creativity are sensitivity to problems, fluency in generating ideas, flexibility and novelty of ideas, and the ability to synthesize and reorganize information [2]. Sensitivity to problems is the skill of finding a problem; we regard it as the skill of comprehending the learning task. Fluency in generating ideas indicates how many ideas a person creates. Flexibility is the skill of creating various ideas. Novelty of ideas is the skill of creating unusual ideas. The ability to synthesize and reorganize information is the skill of utilizing a thing for divergent purposes; we consider that it requires a focus on the interpretation of the meta-learning process.

Next, we describe the characteristics that concern our research. As for sensitivity to problems, the learner can comprehend the structure of the learning environment through trial and error. As for fluency in generating ideas, the learner can find many solutions because the task gives him/her achievements. As for flexibility, the learner can find various solutions by viewing his/her learning trace in the learning goal space.

2.2 The Creative Learning

We define creative learning as continuous learning accompanied by the learner’s own discoveries of unusual solutions from achievements. In our previous research [6], a human designer prepared the achievements for a learner as a sequence of mazes. However, a creative learner continuously needs new achievements. In other words, the creative learner has to discover new achievements by himself/herself, which is not easy. We therefore propose an interactive mechanism between a human learner and the learning support system, in which the system derives achievements from the solutions found by the learner using two kinds of heuristics. Once the learner finds an unusual solution, the system can derive new achievements from it.

2.3 The Creative Learning Process

An overview of continuous learning

Our creative learning process is based on the individual continuous learning process [5]. The concept of continuous learning comes from industrial and organizational psychology [4]. One conceptual definition of individual continuous learning is as follows: “Continuous learning at the individual level is regularly changing behavior based on a deepening and broadening of one’s skills, knowledge, and worldview” [3]. For details, please refer to [6].

The flow of the creative learning process

Figure 1 shows the flow of the creative learning process, which is based on the continuous learning process. The process consists of three nested cycles. The innermost cycle is called a trial. A trial is defined as a transition sequence from the start state until the learner encounters either a goal state or a wall. In this cycle, the learner repeats an action and the accompanying mental process, including awareness, until the task ends in either success or failure. The second cycle is called an achievement. An achievement is defined as a unit of the main task, which is the learning of a maze with a start and an invisible goal. In this cycle, when a trial terminates by reaching the goal, the learner has found a solution of the achievement. The learner then reflects on the trial by viewing his/her learning traces in the learning goal space. This process is described in Sect. 3.4. If the current trial does not reach the goal, the learner restarts the trial from the start state. The outermost cycle is the creative learning cycle. When the learner accomplishes the current achievement, the system generates a derived achievement according to the learner’s solution, and the learner can then challenge the new achievement. Section 3.3 describes this process.

Fig. 1. The flow of the creative learning process

2.4 The Creative Learning Skill

We define the creative learning skill as the skill of trying to find more creative solutions to given tasks or problems that have an optimal or entrenched solution. We propose an interactive mechanism consisting of two parts. The human part is to find a new solution of the achievement. The support system’s part is to generate a new achievement derived from the human learner’s solution by adding a sub-reward at a random position on it, which supports the learner in finding more creative solutions.

3 Designing the Creative Learning Support System

3.1 The Learning Environment by a Maze Model

As the learning environment for a human learner, we adopt a grid maze with a start and a goal, since finding a path through trial and error is a familiar example. First, we define a maze, a path in the maze, and a solution of the maze. A maze is a 2D shape defined by three kinds of states (start, goal, and normal states) and the walls surrounding the states. The details are given below as the maze model. A path consists of a sequence of states and action transitions from the start state to the goal state. A solution is a path that accomplishes the achievement of the maze.

The maze model for creative learning consists of five elements: the state set, transitions, walls, the action set, and rewards. Figure 2 shows the structure of a 2D grid maze. An n × m grid maze with four neighbors consists of n × m squares of size 1 × 1. A maze of rectangular shape surrounded by walls is called a simple maze. Figure 2(a) shows a 3 × 2 simple maze with a start and a goal. In a grid maze, every square touches its neighbors along an edge unless the edge is a wall. Each square in a maze model is called a state. A state can be visited at once.

Fig. 2. The structure of a 3 × 2 simple maze [5]

Transitions between states in a maze model are defined by whether the corresponding square is connected to each of its four neighbors {up, down, left, right} or separated from it by a wall. They are represented as a labeled directed graph, as shown in Fig. 2(b). The action set is defined as the set of labels that distinguish the possible transitions of a state. In a grid maze, the learner can take four kinds of actions: up, right, down, and left.
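The labeled directed graph of transitions can be illustrated with a minimal Python sketch. It assumes a simple maze with outer walls only, reads "3 × 2" as 3 columns by 2 rows, and addresses states as (row, column) pairs; the names `build_transitions` and `ACTIONS` are hypothetical and not taken from the original system.

```python
from typing import Dict, Tuple

State = Tuple[int, int]  # assumed addressing: (row, column)

# The four action labels named in the text, with assumed grid offsets.
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0),
    "right": (0, 1),
    "down": (1, 0),
    "left": (0, -1),
}


def build_transitions(rows: int, cols: int) -> Dict[State, Dict[str, State]]:
    """Map every state to its outgoing transitions, labeled by action.

    Only the outer walls of a simple maze are modeled: an action that
    would leave the grid has no edge in the graph.
    """
    graph: Dict[State, Dict[str, State]] = {}
    for r in range(rows):
        for c in range(cols):
            edges: Dict[str, State] = {}
            for action, (dr, dc) in ACTIONS.items():
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    edges[action] = (nr, nc)
            graph[(r, c)] = edges
    return graph


graph = build_transitions(rows=2, cols=3)
print(graph[(0, 0)])  # the corner state has two outgoing transitions
```

Storing the graph as a dictionary keyed by state keeps the action labels explicit, which is convenient when sub-rewards are later attached to individual transitions.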

Note that a trial is a transition sequence from the start state until the learner encounters either a goal state or a wall; an action toward a wall results in a transition to the start state, which restarts the trial. A transition to the goal state results in the success of the achievement: the learner finds a solution and obtains the main reward (+1).
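The trial rule can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the maze is a simple maze with outer walls only, states are (row, column) pairs, and the function name `run_trial` and the outcome labels are hypothetical.

```python
from typing import List, Sequence, Tuple

State = Tuple[int, int]
MOVES = {"up": (-1, 0), "right": (0, 1), "down": (1, 0), "left": (0, -1)}
MAIN_REWARD = 1  # main reward for reaching the goal, as stated in the text


def run_trial(rows: int, cols: int, start: State, goal: State,
              actions: Sequence[str]) -> Tuple[str, List[State], int]:
    """Play one trial and return (outcome, visited states, reward).

    The trial ends when the learner reaches the goal or moves into a wall.
    After a "wall" outcome the caller restarts from the start state.
    """
    state = start
    visited = [state]
    for action in actions:
        dr, dc = MOVES[action]
        nxt = (state[0] + dr, state[1] + dc)
        if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
            return "wall", visited, 0          # hit the outer wall
        state = nxt
        visited.append(state)
        if state == goal:
            return "goal", visited, MAIN_REWARD  # solution found
    return "unfinished", visited, 0              # action list ran out


outcome, path, reward = run_trial(2, 3, (0, 0), (1, 2),
                                  ["right", "right", "down"])
```

The "unfinished" outcome exists only because the sketch replays a fixed action list; an interactive learner would keep acting until one of the two terminal events occurs.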

3.2 Designing the Creative Learning Support System

This section describes how a new achievement is generated automatically, as shown in Fig. 1. First, we describe a stage and an achievement in the creative learning task. A stage is a set of achievements that share the same maze shape. An achievement of the creative learning task is defined as the learning of a maze to find a path from the start state to the goal state. An achievement consists of a maze shape and, if any, generated sub-rewards. It is a unit of learning and is either an original achievement, which consists of only a maze shape, or a derived achievement, which contains generated sub-rewards.

Figure 3 shows the flow of generating a new achievement by the system. The inner loop in Fig. 3 is the interactive process in which the system generates a new achievement from the solution found by the learner. After the learner finds a solution, if it is a new one, it is displayed in the learning goal space as a found learning goal for the learner’s reflection, and the system then derives an achievement by adding a sub-reward according to the type of the solution. The system resets the achievement and the learner tries it. The outer loop in Fig. 3 is the progress of the learning stage, which occurs in two ways. One is when the learner decides to leave the current stage. The other is a decision by the system when the learner finds the same solution in series. In this paper, the sequence of rectangular maze shapes of the stages is predefined as 2 × 3, 3 × 3, 3 × 4, 4 × 4, and so on.

Fig. 3. The flow of generating a new achievement by the system
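The outer-loop decision can be sketched as a small predicate. The text does not say how many consecutive repetitions count as "in series"; the sketch assumes two, exposed as the hypothetical parameter `repeat_limit`. The function name `should_advance` and the string encoding of solutions are also assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Stage sequence given in the text (first four shapes only).
STAGES: List[Tuple[int, int]] = [(2, 3), (3, 3), (3, 4), (4, 4)]


def should_advance(found: Sequence[str], learner_quits: bool,
                   repeat_limit: int = 2) -> bool:
    """Decide whether the stage progresses.

    found         -- solutions in discovery order, each encoded as a string
    learner_quits -- True when the learner decides to leave the stage
    repeat_limit  -- assumed number of identical consecutive solutions
                     that triggers the system's decision
    """
    if learner_quits:
        return True
    if len(found) < repeat_limit:
        return False
    tail = found[-repeat_limit:]
    return all(solution == tail[0] for solution in tail)


stage_index = 0
history = ["RRD", "DRR", "DRR"]  # hypothetical solutions for the 2 x 3 stage
if should_advance(history, learner_quits=False):
    stage_index = min(stage_index + 1, len(STAGES) - 1)
```

Keeping the rule in one predicate makes the two exit conditions of the outer loop easy to vary independently of the inner loop that derives achievements.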

3.3 Designing Automatically Generating the Derived Achievement

Classification of the solutions for creative learning

This section describes the classification of solutions for creative learning. First, we consider the size of a solution. Table 1 classifies solutions based on their length, according to whether a solution is the shortest or not. Note that some notion of optimality is needed to turn a redundant solution into a learning goal.

Table 1. Classifying solutions based on the length of them
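The length-based classification can be sketched with a breadth-first search that yields the shortest solution length. The sketch assumes a simple maze with outer walls only and (row, column) states; `shortest_length` and `classify` are hypothetical names.

```python
from collections import deque
from typing import Tuple

State = Tuple[int, int]


def shortest_length(rows: int, cols: int, start: State, goal: State) -> int:
    """Breadth-first search for the number of transitions on a shortest path."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((-1, 0), (0, 1), (1, 0), (0, -1)):
            nxt = (r + dr, c + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("goal is unreachable from start")


def classify(solution_length: int, rows: int, cols: int,
             start: State, goal: State) -> str:
    """Label a solution as 'shortest' or 'redundant' by its length."""
    best = shortest_length(rows, cols, start, goal)
    return "shortest" if solution_length == best else "redundant"


label_a = classify(3, 2, 3, (0, 0), (1, 2))
label_b = classify(5, 2, 3, (0, 0), (1, 2))
```

In a 3-column by 2-row maze with opposite-corner start and goal, a three-step path is labeled shortest and the five-step detour down, right, up, right, down is labeled redundant.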

Second, we introduce the optimality of a solution to define solution quality. This paper adopts the average reward reinforcement learning framework, in which an optimal solution is defined as a solution with the maximum average reward. The average reward of a solution is the sum of the acquired rewards divided by the solution length. Therefore, a shortest solution that acquires rewards tends to be optimal.
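A small numerical sketch may make the average reward concrete. It assumes that the main reward (+1) counts among the acquired rewards, that the solution length is the number of transitions, and that sub-rewards are stored per transition; the name `average_reward` and the example sub-reward placement are hypothetical.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]
Transition = Tuple[State, State]


def average_reward(path: List[State],
                   sub_rewards: Dict[Transition, float],
                   main_reward: float = 1.0) -> float:
    """Sum of acquired rewards divided by the solution length."""
    transitions = list(zip(path, path[1:]))
    total = main_reward + sum(sub_rewards.get(t, 0.0) for t in transitions)
    return total / len(transitions)


shortest = [(0, 0), (0, 1), (0, 2), (1, 2)]                    # length 3
redundant = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 2), (1, 2)]   # length 5

plain_short = average_reward(shortest, {})    # 1/3
plain_long = average_reward(redundant, {})    # 1/5

# Hypothetical sub-reward of -1 on the first transition of the shortest path.
penalty = {((0, 0), (0, 1)): -1.0}
penalized_short = average_reward(shortest, penalty)   # (1 - 1) / 3
penalized_long = average_reward(redundant, penalty)   # unaffected
```

Without sub-rewards the shortest path has the higher average reward; once a single −1 sits on it, the redundant path that avoids that transition becomes the better solution under the same criterion.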

In the field of reinforcement learning, ways of finding an optimal solution have been investigated in recent years. However, finding an optimal solution is not the end of learning in creative learning. We therefore focus on learning after an optimal solution has been found, and also on redundant, i.e., non-optimal, solutions, which have not attracted attention as learning goals. The following subsections describe how to derive a learning goal from a shortest solution and from a redundant solution.

Deriving a new achievement from negating an optimal solution

This subsection describes the method of generating a derived achievement from the negation of an optimal solution. When a learner finds a shortest solution, the system adds a negative sub-reward to one of the transitions in the found path in an attempt to negate it. This rough negation of the optimal solution derives a new achievement creatively, since the remaining redundant solutions encourage the learner to be creative in avoiding the negative sub-reward and finding new solutions. Note that the position of the negative sub-reward is chosen at random on the path, and its value is −1.

Deriving a new achievement from justifying a redundant solution

This subsection describes the method of generating a derived achievement from the justification of a non-optimal solution. When a learner finds a redundant solution, the system adds a positive sub-reward to one of the transitions in the found path in an attempt to justify it. This rough justification of the redundant solution derives a new achievement creatively, since there may be solutions that are better than this redundant solution once the positive sub-reward is present. Note that the position of the positive sub-reward is chosen at random on the path, and its value is +1.
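The two derivation rules of Sect. 3.3 can be sketched together. The sketch assumes a uniform random choice among the transitions of the found path, which is one reading of "random on the path"; the function name `derive_sub_reward` and the (row, column) path encoding are hypothetical.

```python
import random
from typing import List, Tuple

State = Tuple[int, int]
Transition = Tuple[State, State]


def derive_sub_reward(path: List[State], is_shortest: bool,
                      rng: random.Random) -> Tuple[Transition, int]:
    """Pick one transition of the found path and attach a sub-reward.

    A shortest solution is negated with -1; a redundant solution is
    justified with +1. The position is drawn uniformly (an assumption).
    """
    transitions = list(zip(path, path[1:]))
    position = rng.choice(transitions)
    value = -1 if is_shortest else 1
    return position, value


rng = random.Random(0)  # seeded only so that the sketch is repeatable
shortest = [(0, 0), (0, 1), (0, 2), (1, 2)]
redundant = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 2), (1, 2)]

neg_position, neg_value = derive_sub_reward(shortest, True, rng)
pos_position, pos_value = derive_sub_reward(redundant, False, rng)
```

The derived achievement is then the same maze shape plus the returned (transition, value) pair, in line with the definition of a derived achievement in Sect. 3.2.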

3.4 Designing the Learning Goal Space

An overview of the learning goal space

The learning goal space is the space in which learning goals are positioned and displayed. Its vertical axis is the solution quality and its horizontal axis is the solution size. A learning goal is the set of solutions, among those found by the learner in the achievements, that share the same solution quality and the same solution size. It is positioned as a point in the learning goal space. In this paper, the vertical axis is defined as the total self-entropy based on the acquired rewards, and the horizontal axis is defined as the length of the solution. Note that these definitions are not presented to the human learner. Figure 4 shows an illustrative example of the learning goal space. LGi is the i-th learning goal found by the learner; the index expresses the order in which the learner finds the goals. The transition from LG1 to LG2 shows that the direction of learning is to the right, meaning that the learning increases only the solution size. The transition from LG2 to LG3 is in the vertical direction, in which only the solution quality grows. The transition from LG3 to LG4 is in a direction that increases both the solution size and the solution quality simultaneously.

Fig. 4. The illustrated example of the learning goal space
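The bookkeeping behind the learning goal space can be sketched as follows. Learning goals are keyed by (solution length, solution quality), and each new key receives the next index. The quality values are hypothetical numbers chosen to reproduce the pattern of Fig. 4, and `register_goal` and `direction` are illustrative names.

```python
from typing import Dict, Optional, Tuple

Goal = Tuple[int, float]  # (solution length, solution quality)


def register_goal(space: Dict[Goal, int], length: int,
                  quality: float) -> Optional[int]:
    """Add a learning goal if it is new; return its index, else None."""
    key = (length, round(quality, 6))
    if key in space:
        return None
    space[key] = len(space) + 1
    return space[key]


def direction(prev: Goal, cur: Goal) -> str:
    """Describe the learning direction between two consecutive goals."""
    d_size = cur[0] - prev[0]
    d_quality = cur[1] - prev[1]
    if d_size > 0 and d_quality > 0:
        return "size and quality"
    if d_size > 0 and d_quality == 0:
        return "size only"
    if d_size == 0 and d_quality > 0:
        return "quality only"
    return "other"


space: Dict[Goal, int] = {}
# Hypothetical solutions in discovery order; the third repeats a known goal.
for length, quality in [(3, 0.0), (5, 0.0), (5, 0.0), (5, 0.7), (7, 1.4)]:
    register_goal(space, length, quality)

goals = list(space)  # dict insertion order is the discovery order
moves = [direction(a, b) for a, b in zip(goals, goals[1:])]
```

With these values the three transitions come out as "size only", "quality only", and "size and quality", mirroring LG1 to LG4 in the illustrated example.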

The total self-entropy based on the acquired reward

In this research, we use the difficulty of obtaining positive sub-rewards in a solution as the index of solution quality. Following the self-entropy (self-information) of information theory, the self-entropy is the logarithm of the reciprocal of the occurrence probability of an acquired reward. In this paper, the self-entropy based on the acquired reward is defined as follows:

$$ I\left( r \right) = - \log_{10} P(r) $$

where r is the event of acquiring a positive sub-reward and P(r) is the occurrence probability of the event per step. The occurrence probability of a positive sub-reward is as follows:

$$ P\left( r \right) = n(r)/L $$

where n(r) is the total size of the positive sub-reward and L is the expected length of the cycle. The total self-entropy based on the acquired reward is as follows:

$$ S = \sum\nolimits_{r \in \mathbb{R}} I(r) $$

where S is the total self-entropy based on the acquired reward and \( \mathbb{R} = \left\{ r_{1}, r_{2}, \ldots \right\} \) is the set of positive sub-rewards.
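The three formulas can be traced numerically with a short sketch. It rests on two assumptions that the text leaves open: n(r) is taken to be the value of the individual positive sub-reward r (here +1), and L is taken to be the length of the solution in which r is acquired. The function names are hypothetical.

```python
import math
from typing import Iterable


def self_entropy(n_r: float, length: int) -> float:
    """I(r) = -log10(P(r)) with P(r) = n(r) / L."""
    probability = n_r / length
    return -math.log10(probability)


def total_self_entropy(sub_reward_sizes: Iterable[float], length: int) -> float:
    """S = sum of I(r) over the positive sub-rewards acquired."""
    return sum(self_entropy(n_r, length) for n_r in sub_reward_sizes)


# A solution of length 5 that acquires two positive sub-rewards of size +1:
# P(r) = 1/5 for each, so I(r) = log10(5) and S = 2 * log10(5).
quality = total_self_entropy([1, 1], length=5)

# A solution without positive sub-rewards has S = 0 (an empty sum).
baseline = total_self_entropy([], length=5)
```

Under these assumptions a longer solution that collects the same sub-rewards scores higher, because each reward is rarer per step, and a solution with no positive sub-reward lies on the horizontal axis of the learning goal space.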

4 Preliminary Experiment

To examine the effects of our creative learning support system, we conducted a preliminary experiment with three subjects.

4.1 Experimental Setup

The creative learning task

The creative learning task is a maze learning task consisting of a sequence of stages with mazes of different sizes. The objective is both to find various solutions and to obtain as high a score, implemented as rewards, as possible. A stage is a set of achievements of the same maze. We employ six kinds of simple mazes with visible walls, one per stage: 2 × 3, 3 × 3, 3 × 4, 4 × 4, 4 × 5, and 5 × 5. The learning goal space is displayed to the learner throughout the task and is updated when a subject finds a new learning goal.

The instruction for subjects

The following is a brief summary of the instructions given to the subjects.

  • The objective of the creative learning task is both to find more solutions in each maze and to obtain as high a score as possible.

  • If you obtain a positive score on a transition, the rectangle between the two neighboring squares turns green.

  • If you obtain a negative score on a transition, the rectangle between the two neighboring squares turns purple.

  • If you find the same solution in series, the stage progresses.

The measurement items

The major measurement items of the experiment are as follows:

  (a) the number of trials

  (b) the number of found solutions

  (c) the number of learning goals

  (d) the total number of acquired positive sub-rewards

  (e) the total number of acquired negative sub-rewards

  (f) the score

  (g) the experiment duration

Note that (f) = (c) + (d) − (e).

4.2 Experimental Results

The experimental results suggest that the subjects learned continuously: the average experiment duration of the three subjects was twenty-five minutes (one thousand five hundred seconds), and the average number of solutions found per subject was one hundred twenty-two. Table 2 shows the experimental results. In particular, the three subjects found more than ten solutions in three stages, which is half of the six stages.

Table 2. Experimental results

Next, we report the results in detail. Table 3 shows the average learning results of the three subjects for each stage in which they found more than ten solutions.

Table 3. The average learning results per stage in which three subjects found more than ten solutions

In these stages, the average number of found solutions per stage was 36.7, and the solutions found in the three stages accounted for ninety percent (330/367) of the total number found in all stages. This suggests that each subject found more than one hundred solutions on average in the learning experiment and that the subjects could carry out the task continuously.

Next, we describe the effect of supporting the search for creative solutions by automatically generating derived achievements. As shown in Table 3, the positive sub-rewards acquired through the automatically generated derived achievements accounted for about ninety-eight percent overall (1048/1061, 384/395, 317/328) in the stages in which the subjects found more than ten solutions. These results suggest that the support by automatically generating derived achievements motivated the three subjects to find new solutions creatively.

4.3 Discussions

First, we analyze, based on the questionnaires, what effect our proposed creative learning support methods had on the subjects’ creative learning. Table 4 shows the factors that motivated the subjects to continue the task. In the questionnaire, all subjects answered that searching for solutions motivated them to continue the task. In the follow-up interview, two subjects answered that the learning goal space motivated them to continue the task and that they did not pay attention to the sub-rewards. By contrast, the other subject answered that he did not look at the learning goal space and instead worked out how sub-rewards were added.

Table 4. The factors which motivate the subject

Second, we discuss the effect of reflection on learning through the presentation of the learning goal space. Figures 5, 6, and 7 show the resulting learning goal space of each subject. Figure 5 shows the learning goal space of the subject who obtained the maximum sum of positive scores (the sum of the positive rewards) among the three. Compared with those of the other subjects shown in Figs. 6 and 7, its learning goals are distributed in a narrow band that rises as the solution length grows. Note that the number of learning goals equals the number of points in the learning goal space, and the points on the horizontal axis are solutions without positive sub-rewards.

Fig. 5. The learning goal space of the 1st subject

Fig. 6. The learning goal space of the 2nd subject

Fig. 7. The learning goal space of the 3rd subject

Third, we discuss the meaning of our creative learning support, which directs creative learning by adding two kinds of sub-rewards. Automatically generating derived achievements supports both convergent and divergent thinking in creative thinking. Adding a positive sub-reward to the found path supports convergent thinking, because the learner tries to approach solutions similar to the found one in order to obtain the positive sub-reward. In contrast, adding a negative sub-reward to the found path supports divergent thinking, because the learner tries to find solutions different from the found one in order to avoid the negative sub-reward, so the learning diverges.

5 Conclusions

This paper presented an interactive method by which a human learns creatively with a learning support system. We described a way to design a learning support system for acquiring a creative learning skill. We proposed automatically generating derived achievements from found solutions and presenting the learning goal space to the learner for reflection on learning. As future work, we plan to conduct an updated version of the experiment.