1 Introduction

CBR [1] solves problems using already stored knowledge and captures new knowledge, making it immediately available for solving the next problem. Therefore, CBR can be seen both as a method for problem solving and as a method to capture new experience and make it immediately available for problem solving. It can be seen as an incremental learning and knowledge-discovery approach, since it can capture general knowledge, such as case classes, prototypes and higher-level concepts, from new experience.

The CBR paradigm was originally introduced by the cognitive science community. The CBR community aims at developing computer models that follow this cognitive process. Computer models based on CBR have been successfully developed for many application areas, such as signal/image processing and interpretation tasks, help-desk applications, medical applications and e-commerce product-selling systems.

In this paper we will explain the CBR process scheme in Sect. 2. We will show what kinds of methods are necessary to provide all the functions required for such a computer model. Then we will focus on similarity in Sect. 3. Memory organization in a CBR system will be described in Sect. 4. Both similarity and memory organization are involved in learning in a CBR system. Therefore, in each section an introduction will be given as to what kind of learning can be performed. In Sect. 5 we will describe open topics in CBR research for specific applications. We will focus on meta-learning for parameter selection, image interpretation, incremental prototype-based classification, and novelty detection and handling. In Sect. 5.1 we will describe meta-learning for parameter selection for data processing systems. CBR-based image interpretation will be described in Sect. 5.2 and incremental prototype-based classification in Sect. 5.3. New concepts on novelty detection and handling will be presented in Sect. 5.4. While reviewing the CBR work, we will try to bridge between the concepts developed within the CBR community and the concepts developed in the statistics community. Finally, we will summarize our view on CBR in Sect. 6.

2 Case-Based Reasoning

CBR is used when generalized knowledge is lacking. The method works on a set of cases formerly processed and stored in a case base. A new case is interpreted by searching for similar cases in the case base. Among this set of similar cases, the closest case with its associated result is selected and presented as the output.

In contrast to a symbolic learning system, which represents a learned concept explicitly, e.g. by formulas, rules or decision trees, a CBR learning system describes a concept \( C \) implicitly by a pair \( (CB,sim) \), where \( CB \) is the case base and \( sim \) the similarity measure, and changes the pair \( (CB,sim) \) until no further change is necessary, i.e. until it is a correct classifier for the target concept C.

Formally, we understand a case as follows:

Definition 1.

A case F is a triple (P, E, L) with a problem description P, an explanation of the solution E and a problem solution L.

The problem description summarizes the information about a case in the form of attributes or features. Other case representations such as graphs, images or sequences may also be possible. The case description is given a-priori or needs to be elicited during a knowledge-acquisition process. Only the most predictive attributes will guarantee that we find exactly the most similar cases.

Definition 1 and the implicit concept description \( (CB,sim) \) give a hint as to how a case-based learning system can improve its classification ability. The learning of a CBR system is incremental in nature and can also be considered as on-line learning. In general, there are several possibilities to improve the performance of a case-based system. The system can change the vocabulary V (attributes, features), store new cases in the case base CB, change the measure of similarity sim, or change V, CB and sim in combination.

That brings us to the notion of knowledge containers introduced by Richter [2]. According to Richter, the four knowledge containers are the underlying vocabulary (or features), the similarity measure, the solution transformation, and the cases. The first three represent compiled knowledge, since this knowledge is more stable. The cases are interpreted knowledge. As a consequence, newly added cases can be used directly. This enables a CBR system to deal with dynamic knowledge. In addition, knowledge can be shifted from one container to another. For instance, in the beginning a simple vocabulary, a rough similarity measure, and no knowledge on solution transformation are used, while a large number of cases is collected. Over time, the vocabulary can be refined and the similarity measure defined in closer accordance with the underlying domain. In addition, it may be possible to reduce the number of cases, because the improved knowledge within the other containers now enables the CBR system to better differentiate between the available cases.

The abstraction of cases into a more general case (concepts, prototypes and case classes) or the learning of higher-order relations between different cases may reduce the size of the case base and speed up the retrieval phase of the system [3]. It can make the system more robust against noise. More abstract cases which are set in relation to each other will give the domain expert a better understanding of his domain. Therefore, besides the incremental improvement of the system performance through learning, CBR can also be seen as a knowledge-acquisition method that can help to get a better understanding of the domain [4, 5] or to learn a domain theory.

The main problems in the development of a CBR system are the following: What makes up a case? What is an appropriate similarity measure for the problem? How should a large number of cases be organized for efficient retrieval? How can a new case be acquired and refined for entry into the case base? How can specific cases be generalized to a case that is applicable to a wide range of situations?

3 Similarity

3.1 Similarity Measures

Although similarity is a concept humans prefer to use when reasoning over problems, they usually do not have a good understanding of how similarity is formally expressed. Similarity seems to be a very incoherent concept.

From the cognitive point of view, similarity can be viewed from different perspectives [8]. A red bicycle and a blue bicycle might be similar in terms of the concept “bicycle”, but both bicycles are dissimilar when looking at the colour. It is important to know what kind of similarity is to be considered when reasoning over two objects. Overall similarity, identity, similarity, and partial similarity need to be modelled by the right flexible control strategy in an intelligent reasoning system. It is especially important in image data bases where the image content can be viewed from different perspectives. Image data bases need to have this flexibility and computerized conversational strategies to figure out from what perspective the problem is looked at and what kind of similarity has to be applied to achieve the desired goal. From the mathematical point of view, the Minkowski metric is the most used similarity measure for technical problems:

$$ d_{{ii^{\prime}}}^{(p)} = \left[ {\frac{1}{J}\sum\limits_{j = 1}^{J} {\left| {x_{ij} - x_{{i^{\prime}j}} } \right|^{p} } } \right]^{1/p} $$
(1)

The choice of the parameter p depends on the importance we give to the differences in the summation. Metric properties such as symmetry, identity and the triangle inequality hold for the Minkowski metric.

If we use the Minkowski metric for calculating the similarity between two 1-dimensional curves, such as the 1-dimensional path signal of a real robot axis and the reconstructed 1-dimensional signal of the same robot axis [9], calculated from the compressed data points stored in a storage device, it might not be preferable to choose p = 2 (Euclidean metric), since the measure averages over all data points but gives more emphasis to big differences. If we choose p = 1 (City-Block metric), big and small differences have the same impact on the similarity measure. In case of the Max-Norm (p = ∞), none of the data-point differences should exceed a predefined difference. In practice this means that the robot axis performs a smooth movement over the path with a known deviation from the real path and will never end up in the worse situation of performing a ramp-like movement. In the robot example the domain itself gives us an understanding of the appropriate similarity metric.
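As a simple illustration of how the choice of p changes the behaviour of Eq. 1, the following minimal Python sketch (the signal values are hypothetical) compares the City-Block metric, the Euclidean metric and the Max-Norm on two short 1-dimensional signals:

```python
import numpy as np

def minkowski_distance(x, y, p):
    """Minkowski distance of Eq. 1, averaged over the J data points.
    p = np.inf yields the Max-Norm (largest single point difference)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    diff = np.abs(x - y)
    if np.isinf(p):
        return diff.max()
    return np.mean(diff ** p) ** (1.0 / p)

# Hypothetical robot-axis path and its reconstruction from compressed data.
original      = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
reconstructed = [0.0, 0.1, 0.2, 0.3, 0.4, 0.9]   # one large deviation

for p in (1, 2, np.inf):
    print(p, minkowski_distance(original, reconstructed, p))
# p = 1 treats all deviations alike, p = 2 gives more emphasis to the single
# large deviation, and the Max-Norm reports only the worst-case deviation.
```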

Unfortunately, for most applications we do not have any a-priori knowledge about the appropriate similarity measure. The method of choice for the selection of the similarity measure is to try different types of similarity and observe their behaviour based on quality criteria while applying them to a particular problem. For classification problems, the error rate is the quality criterion that allows selecting the right similarity measure. Otherwise it is possible to measure how well similar objects are grouped together based on the chosen similarity measure and, at the same time, how well different groups can be distinguished from each other. This turns the problem into a categorization problem for which proper category measures are known from clustering [24] and machine learning [30].
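A minimal sketch of such a selection procedure, assuming a small labelled case base and the leave-one-out nearest-neighbour error rate as the quality criterion, could look as follows:

```python
import numpy as np

def loo_error_rate(cases, labels, dist):
    """Leave-one-out 1-NN error rate for a candidate distance measure."""
    errors = 0
    for i in range(len(cases)):
        d = [dist(cases[i], cases[j]) for j in range(len(cases)) if j != i]
        nearest = int(np.argmin(d))
        j = nearest if nearest < i else nearest + 1   # index of the removed case skipped
        if labels[j] != labels[i]:
            errors += 1
    return errors / len(cases)

# Hypothetical case base: feature vectors with class labels.
cases  = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
labels = ["a", "a", "b", "b"]

candidates = {
    "city-block": lambda x, y: np.sum(np.abs(x - y)),
    "euclidean":  lambda x, y: np.sqrt(np.sum((x - y) ** 2)),
    "max-norm":   lambda x, y: np.max(np.abs(x - y)),
}
for name, dist in candidates.items():
    print(name, loo_error_rate(cases, labels, dist))
```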

In general, distance measures can be classified based on the data-type dimension. There are measures for numerical data, symbolical data, structural data and mixed-data types. Most of the overviews given for similarity measures in various works are based on this view [10, 12, 16]. A more general view to similarity is given in Richter [11].

Other classifications on similarity measures focus on the application. There are measures for time-series [54], similarity measures for shapes [53], graphs [29], music classification [13], and others.

Translation, size, scale and rotation invariance is another important aspect of similarity in technical systems.

Most real-world applications nowadays are more complex than the robot example given above. They usually comprise many attributes that are different in nature. Numerical attributes given by different sensors or technical measurements and categorical attributes that describe meta-knowledge of the application usually make up a case. These n different attribute groups can form partial similarities Sim1, Sim2, …, Simn, which can be calculated based on different similarity measures and may each have a meaning of their own. The final similarity might be comprised of all the partial similarities. The simplest way to calculate the overall similarity is to sum up over all partial similarities, \( Sim = w_{1} Sim_{1} + w_{2} Sim_{2} + \ldots + w_{n} Sim_{n} \), and to model the influence of each particular similarity by a weight wi. Other schemas for combining similarities are possible as well. The usefulness of such a strategy has been shown for meta-learning of segmentation parameters [14] and for medical diagnosis [15].
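A minimal sketch of such a weighted combination, with hypothetical partial similarity values for a numerical and a categorical attribute group, might look as follows:

```python
def combined_similarity(partial_sims, weights):
    """Weighted sum Sim = w1*Sim1 + ... + wn*Simn of partial similarities.
    Weights are normalized so that the overall similarity stays in [0, 1]."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, partial_sims)) / total

# Hypothetical case with a numerical part (sensor readings) and a
# categorical part (meta-knowledge), each judged by its own measure.
sim_numerical   = 0.8   # e.g. 1 - normalized Minkowski distance
sim_categorical = 0.5   # e.g. fraction of matching attribute values

print(combined_similarity([sim_numerical, sim_categorical], weights=[2.0, 1.0]))
```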

The introduction of weights into the similarity measure in Eq. 1 puts a different importance on particular attributes and views similarity not only as global similarity, but also as local similarity. Learning the attribute weights allows building particular similarity metrics for specific applications. A variety of methods based on linear or stochastic optimization [18], heuristic search [17], genetic programming [25], and case ordering [20] or query ordering in NN-classification have been proposed for attribute-weight learning.

Learning the distance function in response to users' feedback is known as relevance feedback [21, 22] and is very popular in data base and image retrieval. The optimization criterion is the accuracy or performance of the system rather than the individual problem-case pairs. This approach is biased by the learning approach as well as by the case description.

New directions in CBR research build a bridge between the case and the solution [23]. Cases can be ordered based on their solutions by preference relations [26] or similarity relations [27] given by the users or known a-priori from the application. The derived values can be used to learn the similarity metric and the relevant features. That means that cases having similar solutions should have similar case descriptions. The set of features as well as the feature weights are optimized until they meet this assumption. Learning a distance function by linear transformation of features has been introduced by Bobrowski et al. [19].

3.2 Semantics of Similarity

It is preferable to normalize the similarity values to the range between 0 and 1 in order to be able to compare different similarity values on a common scale. A scale between 0 and 1 gives us a symbolic understanding of the meaning of the similarity value. The value of 0 indicates identity of the two cases, while the value of 1 indicates that the cases are unequal. On this scale the value of 0.5 is neutral; values between 0 and 0.5 mean more similarity, and values between 0.5 and 1 mean more dissimilarity.

Different normalization procedures are known. The most popular one is the normalization to the upper and lower bounds of a feature value.

The main problem arises when the case base is not yet filled and contains only a small number of cases, while further cases are collected incrementally as they arrive in the system. In this case the upper and lower bounds of a feature value \( [x_{\min,i,k}, x_{\max,i,k}] \) can only be judged based on this limited set of cases at the point in time \( t_{k} \) and need not match the true values \( x_{\min,i} \) and \( x_{\max,i} \) of feature i. The scale of similarity might then change over the time periods \( t_{k} \), which will lead to different decisions for two cases at the points in time \( t_{k} \) and \( t_{k + l} \). The problem of normalization in an incremental case-based learning system therefore needs to be considered in a different way. A first explanation of this problem is given in [55].
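The following minimal sketch (with a hypothetical feature-value stream) illustrates the problem: the bounds estimated at time t_k change as new cases arrive, so the normalized distance between the same two cases changes over time.

```python
def minmax_normalize(x, lo, hi):
    """Normalize a feature value to [0, 1] using the bounds known so far."""
    return (x - lo) / (hi - lo) if hi > lo else 0.0

# Hypothetical stream of values for one feature, arriving incrementally.
stream = [4.0, 6.0, 10.0, 20.0, 2.0]
lo, hi = stream[0], stream[0]

a, b = 4.5, 5.5                        # two fixed cases we keep comparing
for t, x in enumerate(stream):
    lo, hi = min(lo, x), max(hi, x)    # bounds at time t_k, not the true ones
    d = abs(minmax_normalize(a, lo, hi) - minmax_normalize(b, lo, hi))
    print(f"t_{t}: bounds=[{lo}, {hi}]  normalized distance(a, b)={d:.3f}")
# The same two cases get a different normalized distance as the bounds
# [x_min, x_max] are updated, so decisions at t_k and t_{k+l} may differ.
```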

4 Organization of Case Base

The case base plays a central role in a CBR system. All observed relevant cases are stored in the case base. Ideally, CBR systems start reasoning from an empty memory, and their reasoning capabilities stem from their progressive learning from the cases they process [28].

Consequently, the memory organization and structure are the focus of a CBR system. Since a CBR system should improve its performance over time, its memory must change constantly.

In contrast to research in data base retrieval and nearest-neighbour classification, CBR focuses on conceptual memory structures. While k-d trees [31] are space-partitioning data structures for organizing points in a k-dimensional space, conceptual memory structures [29, 30] are represented by a directed graph in which the root node represents the set of all input instances and the terminal nodes represent individual instances. Internal nodes stand for the sets of instances attached to them and represent a super-concept. The super-concept can be represented by a generalized representation of the associated set of instances, such as the prototype, the medoid or a user-selected instance. Therefore a concept C, called a class, in the concept hierarchy is represented by an abstract concept description (e.g. the feature names and their values) and a list of pointers to each child concept M(C) = {C1, C2, …, Ci, …, Cn}, where Ci is a child concept, called a subclass of concept C.

The explicit representation of the concept in each node of the hierarchy is preferred by humans, since it allows understanding the underlying application domain.

While for the construction of a k-d tree only splitting and deleting operations are needed, conceptual learning methods use more sophisticated operations for the construction of the hierarchy [33]. The most common operations are splitting, merging, adding and deleting. What kind of operation is carried out during the concept-hierarchy construction depends on a concept-evaluation function. Statistical evaluation functions are known, as well as similarity-based functions.
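As an illustration of a similarity-based construction of such a concept hierarchy, the following minimal sketch (class name and threshold are hypothetical, and splitting and merging are omitted for brevity) represents each super-concept by the prototype of its attached instances and either descends to the most similar child concept or adds a new sub-concept:

```python
import numpy as np

class ConceptNode:
    """Node of a conceptual memory hierarchy. The super-concept is
    represented here by the prototype (mean) of its attached instances."""
    def __init__(self, instance):
        self.instances = [np.asarray(instance, dtype=float)]
        self.prototype = np.asarray(instance, dtype=float)
        self.children = []

    def add(self, instance, sim_threshold=0.7):
        instance = np.asarray(instance, dtype=float)
        self.instances.append(instance)
        self.prototype = np.mean(self.instances, axis=0)   # update super-concept
        if not self.children:
            self.children.append(ConceptNode(instance))
            return
        # similarity-based evaluation function: 1 / (1 + Euclidean distance)
        sims = [1.0 / (1.0 + np.linalg.norm(c.prototype - instance))
                for c in self.children]
        best = int(np.argmax(sims))
        if sims[best] >= sim_threshold:
            self.children[best].add(instance, sim_threshold)   # descend
        else:
            self.children.append(ConceptNode(instance))        # new sub-concept

root = ConceptNode([0.0, 0.0])
for case in ([0.1, 0.1], [0.9, 0.8], [0.85, 0.9]):
    root.add(case)
print(len(root.children), "sub-concepts under the root")
```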

Because of the variety of construction operators, conceptual hierarchies are not sensitive to the order of the samples. They allow the incremental adding of new examples to the hierarchy by reorganizing the already existing hierarchy. This flexibility is not known for k-d trees, although recent work has led to adaptive k-d trees that allow incorporating new examples.

The concept of generalization and abstraction should make the case base more robust against noise and applicable to a wider range of problems. The concept description, the construction operators as well as the concept evaluation function are in the focus of the research in conceptual memory structure.

The conceptual incremental learning methods for case-base organization put the case base into the dynamic-memory view of Schank [32], who called for a coherent theory of adaptable memory structures and for an understanding of how new information changes the memory.

Memory structures in CBR research are not only purely conceptual structures; hybrid structures incorporating k-d tree methods are studied as well. An overview of recent research on memory organization in CBR is given in [28].

Other work goes in the direction of bridging between implicit and explicit representations of cases [34]. The implicit representations can be based on statistical models, and the explicit representation is the case base that keeps the single case as it is. As soon as enough evidence is available, the data are summarized into statistical models based on statistical learning methods such as Minimum Description Length (MDL) or Minimum Message Length (MML) learning. As long as not enough data for a class or a concept have been seen by the system, the data are kept in the case base. The case base controls the learning of the statistical models by hierarchically organizing the samples into groups. It allows dynamically learning and changing the statistical models based on the experience (data) seen so far and prevents the model from overfitting and from bad influences by singularities.

This concept follows the idea that humans have built up very effective models for standard repetitive tasks and that these models can easily be used without a complex reasoning process. For rare events the CBR unit takes over the reasoning task and collects experience into its memory. The aspects of case-based maintenance can be found in [6] and the lifecycle of a CBR system is described in [7].

5 Applications

CBR has been successfully applied to a wide range of problems. Among them are signal interpretation tasks [35], medical applications [36], and emerging applications such as geographic information systems, applications in biotechnology and topics in climate research (CBR commentaries) [37]. We are focussing here on hot real-world topics such as meta-learning for parameter selection, image and signal interpretation, prototype-based classification, and novelty detection and handling. We begin with meta-learning for parameter selection and then give an overview of CBR-based image interpretation.

5.1 Meta-Learning for Parameter Selection of Data/Signal Processing Algorithms

Meta-learning is a subfield of machine learning where automatic learning algorithms are applied to meta-data about machine-learning experiments. The main goal is to use such meta-data to understand how automatic learning can become flexible with regard to solving different kinds of learning problems, and hence to improve the performance of existing learning algorithms. Another important meta-learning task, although not so widely studied yet, is parameter selection for data or signal processing algorithms. Soares et al. [39] have used this approach for selecting the kernel width of a support-vector machine, while Perner and Frucci et al. [14, 40] have studied it for image segmentation.

The meta-learning problem for parameter selection can be formalized as follows: for a given signal that is characterized by specific signal properties A and domain properties B, find the parameters of the processing algorithm that ensure the best quality of the resulting output signal:

$$ f:A \cup B \to P_{i} $$
(2)

with Pi the i-th class of parameters for the given domain.

What kind of meta-data describes classification tasks has been widely studied within meta-learning in machine learning. Meta-data for images, comprised of image-related meta-data (gray-level statistics) and non-image-related meta-data (sensor, object data), are given in Perner and Frucci et al. [14, 40]. In general, the computation of meta-data from signals and images should not require too much processing, and the meta-data should characterize the properties of the signals that influence the signal processing algorithm.

The mapping function f can be realized by any classification algorithm, but the incremental behaviour of CBR fits best to many data/signal processing problems where the signals are not available ad-hoc but appear incrementally. The right similarity metric that allows mapping data to parameter groups and, ultimately, to good output results should be studied more extensively. Performance measures that allow judging the achieved output and automatically criticizing the system performance are another important problem.
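A minimal sketch of realizing the mapping function f of Eq. 2 by case-based retrieval, assuming hypothetical image-related meta-data and parameter sets, could look as follows; new (meta-data, parameter) cases can be added incrementally as new signals appear:

```python
import numpy as np

class ParameterSelectionCBR:
    """Realizes the mapping f: A ∪ B -> P_i of Eq. 2 by nearest-neighbour
    retrieval over stored (meta-data, parameter set) cases."""
    def __init__(self):
        self.cases = []          # list of (meta-data vector, parameter dict)

    def add_case(self, meta_data, parameters):
        self.cases.append((np.asarray(meta_data, dtype=float), parameters))

    def select_parameters(self, meta_data):
        meta_data = np.asarray(meta_data, dtype=float)
        distances = [np.linalg.norm(m - meta_data) for m, _ in self.cases]
        return self.cases[int(np.argmin(distances))][1]

# Hypothetical image-related meta-data: [mean gray level, gray-level std].
cbr = ParameterSelectionCBR()
cbr.add_case([120.0, 15.0], {"threshold": 0.4, "kernel_size": 3})
cbr.add_case([200.0, 40.0], {"threshold": 0.7, "kernel_size": 5})

print(cbr.select_parameters([190.0, 35.0]))   # -> parameters of the closest case
```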

The abstraction of cases to learn a domain theory is also related to these tasks and would allow a better understanding of the behaviour of many signal processing algorithms that can no longer be described by standard system theory [41].

5.2 Case-Based Image Interpretation

Image interpretation is the process of mapping the numerical representation of an image into a logical representation suitable for scene description. This is a complex process; the image passes through several general processing steps until the final result is obtained. These steps include image pre-processing, image segmentation, image analysis, and image interpretation. Image pre-processing and image segmentation algorithms usually need a lot of parameters to perform well on a specific image. The automatically extracted objects of interest in an image are first described by primitive image features. Depending on the particular objects and focus of interest, these features can be lines, edges, ribbons, etc. Typically, these low-level features have to be mapped to high-level/symbolic features. A symbolic feature such as fuzzy margin will be a function of several low-level features.

The image interpretation component identifies an object by finding the object class to which it belongs (among the models of the object classes). This is done by matching the symbolic description of the object to the model/concept of the object stored in the knowledge base. Most image-interpretation systems run on the basis of a bottom-up control structure. This control structure allows no feedback to preceding processing components if the outcome of the current component is unsatisfactory. A mixture of bottom-up and top-down control would allow the outcome of a component to be refined by returning to the previous component.

CBR is not only applicable as a whole to image interpretation, it is applicable to all the different levels of an image-interpretation system [12, 42], and many of the ideas mentioned in the previous chapters apply here. CBR-based meta-learning algorithms for parameter selection are preferable for the image pre-processing and segmentation unit [14, 40]. The mapping of the low-level features to the high-level features is a classification task for which a CBR-based algorithm can be applied. The memory organization [29] of the interpretation unit goes along with the problems discussed for the case base organization in Sect. 4. Different organization structures for image interpretation systems are discussed in [12]. The organization structure should allow the incremental updating of the memory and learning more abstract cases from single cases. Ideally the system should start working with only a few samples, and during usage of the system new cases should be learnt and the memory should be updated based on these samples.

This view of the usage of a system brings in another topic, the life-time cycle of a CBR system. Work on this topic takes into account that a system is used for a long time, while experience changes over time. The case structure might change by adding new relevant attributes or deleting attributes that have shown not to be important or have been replaced by other ones. Sets of cases might not appear anymore, since these kinds of solutions are no longer relevant. A methodology and software architecture for handling the life-time cycle problem is needed so that this process can easily be carried out without rebuilding the whole system. This seems to be more of a software-engineering task, but it also has to do with evaluation measures that can come from statistics.

5.3 Incremental Prototype-Based Classification

The usage of prototypical cases is very popular in many applications, among them medical applications [43], by Belazzi et al. [45] and by Nilsson and Funk [44], knowledge management systems [46] and image classification tasks [48]. The simple nearest-neighbour approach [47] as well as hierarchical indexing and retrieval methods [43] have been applied to the problem. It has been shown that an initial reasoning system can be built up based on these cases. The systems are useful in practice and can acquire new cases for further reasoning during utilization of the system.

There are several problems connected with prototypical CBR. If a large enough set of cases is available, the prototypical case can automatically be calculated as the generalization of a set of similar cases. In medical applications, as well as in applications where image catalogues are the development basis of the system, the prototypical cases have been selected or described by humans. That means that when building the system we start from the most abstract level (the prototype) and have to collect more specific information about the classes and objects during the usage of the system.

Since a human has selected the prototypical case, his decision on the importance of the case might be biased, and picking only one case might be difficult for a human. In image catalogue-based applications, he may have stored more than one image as a prototypical image. Therefore we need to check the redundancy of the many prototypes for one class before taking them all into the case base.

According to these considerations, the minimal functions a prototype-based classification system should realize are: classification based on a proper similarity measure, prototype selection by a redundancy-reduction algorithm, feature weighting to determine the importance of the features for the prototypes and to learn the similarity metric, and feature-subset selection to select the relevant features from the whole set of features for the respective domain.
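A minimal sketch of the classification function with feature weighting, using hypothetical prototypes and weights (prototype selection and feature-subset selection are omitted for brevity), might look as follows:

```python
import numpy as np

class PrototypeClassifier:
    """Nearest-prototype classification with a weighted Euclidean metric.
    Feature weights model the importance of the features for the prototypes."""
    def __init__(self, weights):
        self.weights = np.asarray(weights, dtype=float)
        self.prototypes = {}          # class label -> prototype vector

    def add_prototype(self, label, prototype):
        self.prototypes[label] = np.asarray(prototype, dtype=float)

    def classify(self, case):
        case = np.asarray(case, dtype=float)
        dist = {label: np.sqrt(np.sum(self.weights * (p - case) ** 2))
                for label, p in self.prototypes.items()}
        return min(dist, key=dist.get)   # label of the closest prototype

clf = PrototypeClassifier(weights=[1.0, 0.2])     # first feature dominates
clf.add_prototype("benign",    [0.2, 0.5])
clf.add_prototype("malignant", [0.8, 0.5])
print(clf.classify([0.6, 0.1]))                   # -> "malignant"
```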

Statistical methods focus on adaptive k-NN that adapts the distance metric by feature weighting or kernel methods, or adapts the number k of neighbours, off-line to the data. Incremental strategies are used for the nearest-neighbour search, but not for updating the weights, the distance metric and the prototype selection.

A prototype-based classification system for medical image interpretation is described in [48]. It realizes all the functions described above by combining statistical methods with artificial intelligence methods to make the system feasible for real-world applications. A system for handwriting recognition is described in [49] that can incrementally add data and adapt the solutions to different users’ writing style. A k-NN realization that can handle data streams by adding data through reorganizing a multi-resolution array data structure and concept drift by realizing a case forgetting strategy is described in [50].

The full incremental behaviour of a system would require an incremental processing schema for all aspects of a prototype-based classifier such as for updating the weights and learning the distance metric, the prototype selection and case generalization.

5.4 Novelty Detection by Case-Based Reasoning

Novelty detection [51], recognizing that an input differs in some respect from previous inputs, can be a useful ability for learning systems.

Novelty detection is particularly useful where an important class is under-represented in the data, so that a classifier cannot be trained to reliably recognize that class. This characteristic is common to numerous problems such as information management, medical diagnosis, fault monitoring and detection, and visual perception.

We propose that novelty detection be regarded as a CBR problem under which we can run the different theoretical methods for detecting and handling novel events [34]. The detection of novel events is a common subject in the literature. The handling of novel events for further reasoning is not treated as much in the literature, although it is a hot topic in open-world applications.

The first model we propose is comprised of statistical models and similarity-based models. For now, we assume an attribute-value-based representation. Nonetheless, the general framework we propose for novelty detection can be based on any representation. The heart of our novelty detector is a set of statistical models that have been learnt in an off-line phase from a set of observations. Each model represents a case-class. The probability density function implicitly represents the data and spares us from storing all the cases of a known case-class. It also allows modelling the uncertainty in the data. This unit acts as a novel-event detector by using the Bayesian decision criterion with the mixture model. Since this set of observations might be limited, we consider our model as being far from optimal and update it based on newly observed examples. This is done based on the Minimum Description Length (MDL) principle or the Minimum Message Length (MML) learning principle [52].

In case our model bank cannot classify an actual event into one of the case-classes, this event is recognized as a novel event. The novel event is given to the similarity-based reasoning unit. This unit incorporates the sample into its case base according to a case-selective registration procedure that allows learning case-classes as well as the similarity between the cases and case-classes. We propose to use a fuzzy similarity measure to model the uncertainty in the data. By doing so, the unit organizes the novel events in a fashion that is suitable for learning a new statistical model.

The case-base maintenance unit interacts with the statistical learning unit and gives advice as to when a new model has to be learnt. The advice is based on the observation that a case-class is represented by a large enough number of samples that are most dissimilar to the other classes in the case base.

The statistical learning unit takes this case-class and decides, based on the MML criterion, whether it is suitable to learn a new model or not. If the statistical component recommends not to learn the new model, the case-class is still hosted by the case-base maintenance unit and further updated based on newly observed events that might change the inner-class structure, until there is enough new evidence to learn a statistical model.
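The following minimal sketch illustrates the proposed interplay; single Gaussian case-class models, a fixed log-likelihood threshold and a minimum sample count are simplifying stand-ins for the mixture models, the Bayesian decision criterion and the MML criterion described above:

```python
import numpy as np

class NoveltyDetector:
    """Sketch of the interplay between statistical models for known
    case-classes, a case base for novel events, and model learning once
    a collected case-class has gathered enough evidence."""
    def __init__(self, threshold=-5.0, min_samples=20):
        self.models = {}                # class label -> (mean, std) per feature
        self.novel_case_base = []       # explicitly stored novel events
        self.threshold = threshold      # stand-in for the Bayesian decision rule
        self.min_samples = min_samples  # stand-in for the MML criterion

    def log_likelihood(self, x, mean, std):
        return float(np.sum(-0.5 * ((x - mean) / std) ** 2
                            - np.log(std * np.sqrt(2 * np.pi))))

    def process(self, x):
        x = np.asarray(x, dtype=float)
        scores = {c: self.log_likelihood(x, m, s)
                  for c, (m, s) in self.models.items()}
        if scores and max(scores.values()) > self.threshold:
            return max(scores, key=scores.get)           # known case-class
        self.novel_case_base.append(x)                   # novel event
        if len(self.novel_case_base) >= self.min_samples:
            data = np.vstack(self.novel_case_base)       # learn a new model
            self.models[f"class_{len(self.models)}"] = (
                data.mean(axis=0), data.std(axis=0) + 1e-6)
            self.novel_case_base = []
        return "novel"

detector = NoveltyDetector(min_samples=3)
for event in ([0.1, 0.2], [0.15, 0.25], [0.12, 0.22], [0.11, 0.21]):
    print(detector.process(event))
# The first three events are "novel" and stored explicitly; they then form a
# new case-class, and the fourth event is explained by the learnt model.
```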

The use of a combination of statistical reasoning and similarity-based reasoning allows implicit and explicit storage of the samples. It allows handling well-represented events as well as rare events.

6 Conclusion

In this paper we have presented our thoughts and work on Case-Based Reasoning. We presented the methods, techniques, and applications. More work on CBR can be found in [38]. CBR solves problems using already stored knowledge, and captures new knowledge, making it immediately available for solving the next problem. To realize this cognitive model in a computer-based system we need methods known from statistics, pattern recognition, artificial intelligence, machine learning, data base research and other fields. Only the combination of all these methods will give us a system that can efficiently solve practical problems. Consequently, CBR research has shown much success for different application areas, such as medical and technical diagnosis, image interpretation, geographic information systems, text retrieval, e-commerce, user-support systems and so on. CBR systems work efficiently in real-world applications, since the CBR method addresses all aspects of a well-performing and user-friendly system.

We have pointed out that the central aspect of a well-performing system in the real world is its ability to incrementally collect new experience and reorganize its knowledge based on these new insights. In our opinion, new challenging research should focus on incremental methods for prototype-based classification, meta-learning for parameter selection, complex signal-understanding tasks and novelty detection. The incremental methods should allow changing the system function based on the newly obtained data.

Recently, we have been observing that this incremental aspect is receiving special attention from quality-assurance agencies for technical and medical applications, although it is in opposition to the current quality-performance guidelines.

While reviewing the CBR work, we have tried to bridge between the concepts developed within the CBR community and the concepts developed in the statistics community. At first glance, CBR and statistics seem to have big similarities. But when looking closer, one can see that the paradigms are different. CBR tries to solve real-world problems and aims to deliver systems that have all the functions necessary for an adaptable intelligent system with incremental learning behaviour. Such a system should be able to work on a small set of cases and collect experience over time. While doing so, it should improve its performance. The solution need not be correct in the statistical sense; rather, it should help an expert to solve his tasks and learn more about them over time.

Nonetheless, statistics offers a rich variety of methods that can be useful for building intelligent systems. If we can combine and extend these methods under the aspects necessary for intelligent systems, we will further succeed in establishing artificial intelligence systems in the real world.

Our interest is to build intelligent, flexible and robust data-interpreting systems that are inspired by the human CBR process, and by doing so to model the human reasoning process when interpreting real-world situations.