Deriving two-stage learning sequences from knowledge in fuzzy sequential pattern mining
Introduction
Sequential pattern mining is the discovery of frequently occurring patterns related to time or other sequences [1], where a sequence is an ordered list of itemsets [2]. Sequential patterns can help managers determine which items are bought after other items have been bought [1], or analyze the order in which the pages of a Web site are browsed [3]. Recently, Hu et al. [4] proposed fuzzy sequential pattern mining, with which much worthwhile fuzzy knowledge related to consumer purchase behavior, such as “large amounts of product A were frequently purchased for each consumer” or “small amounts of one product and large amounts of another product were purchased sequentially”, may be discovered from transaction databases. The former fuzzy term may be roughly interpreted as the consumers’ purchase preference for product A, and the latter is a fuzzy sequential pattern that roughly represents the order in which consumers buy products. In fact, a fuzzy sequential pattern is derived from consumers’ purchase preferences for products and expresses the temporal relation between them. Fuzzy sequential patterns described in natural language are also well suited for use by human subjects and help to increase users’ flexibility in making decisions.
We consider that decision makers can “learn” the above-mentioned fuzzy knowledge as two different types; that is, they can be interested in investigating either consumer behavior or current strategies based on fuzzy sequential patterns. In other words, they can acquire or learn the corresponding knowledge from fuzzy sequential patterns. Ultimately, decision makers should become confident in solving certain decision problems, for example, proposing a more competitive marketing strategy. However, the acquisition of knowledge should be carefully planned rather than blindly undertaken. For example, it seems easier to acquire “algorithms” after both “introduction to computer science” and “data structures” have already been acquired than when only “introduction to computer science” has been acquired. Therefore, how to derive appropriate “learning” sequences for decision makers from the fuzzy knowledge extracted by fuzzy sequential pattern mining is an important problem.
The concept of competence sets was initiated by Yu [5], and its mathematical foundation was provided by Yu and Zhang [6]. For each decision problem, there is a competence set consisting of the ideas, knowledge, information and skills needed for its satisfactory solution [5], [6], [7]. From this viewpoint, we can view the knowledge found in fuzzy sequential pattern mining as the competence set needed for solving one decision problem. In order to acquire the needed competence set effectively, it is necessary to find appropriate learning sequences for acquiring those useful patterns; this is the so-called competence set expansion.
Since the set related to “consumers’ product buying orders” (denoted by C2) is derived from the set related to “consumers’ purchase preference of products” (denoted by C1), we assume that it is helpful for decision makers to learn C2 after first learning C1. That is, by treating C1 as an aggregate skill, we design a two-stage learning sequence consisting of two subsequences: one generated from C1, the other generated from C2. Two-stage learning sequences with minimum cost are derived by the minimum spanning table method (MST) proposed by Feng and Yu [8], since MST is especially well suited to the expansion of sets of single skills or terms. The experimental results show that it is possible to help decision makers effectively acquire a needed competence set found in fuzzy sequential pattern mining, enabling them to set up strategies for promoting their products or improving their services. Note that a compound skill represents a collection of single skills that might be acquired by decision makers [9], [10]; for simplicity, compound skills are not considered in this paper.
The rest of this paper is organized as follows. Since fuzzy sequential pattern mining builds on the simple fuzzy partition method [11], [12], this method is introduced in Section 2. Subsequently, the fuzzy data mining technique for discovering fuzzy sequential patterns is briefly introduced in Section 3, where the generation and representations of C1 and C2 are demonstrated in detail. Section 4 introduces the MST. Detailed experimental results of a numerical example are presented in Section 5. We end this paper with discussions and conclusions in Section 6.
Simple fuzzy partition method
Fuzzy sets were originally proposed by Zadeh [13], who also proposed the concept of linguistic variables and its application to approximate reasoning [14]. A linguistic variable is a variable whose values are words or sentences in a natural language [15]. For example, the values or linguistic terms of the linguistic variable “amount of apple juice purchased” may be “close to 3 pounds” or “very close to 5 pounds”. In this paper, triangular membership functions are used
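As a hedged illustration of how symmetric triangular membership functions can partition the range of a quantitative attribute, the sketch below assumes K evenly spaced linguistic terms on an interval [lo, hi], each peaking at its own grid point and falling to zero at the neighbouring peaks. The function name, the parameter names and the 0 to 10 pound range are illustrative assumptions and are not taken from the paper.

```python
def triangular_membership(x, i, K, lo, hi):
    """Degree to which x belongs to the i-th (1-based) of K linguistic terms.

    Assumption: the K terms are symmetric triangles whose peaks are evenly
    spaced on [lo, hi]; this is one common form of a simple fuzzy partition.
    """
    if K < 2:
        raise ValueError("need at least two linguistic terms")
    if not 1 <= i <= K:
        raise ValueError("term index i must satisfy 1 <= i <= K")
    width = (hi - lo) / (K - 1)      # distance between adjacent peaks
    peak = lo + width * (i - 1)      # position of the i-th peak
    return max(1.0 - abs(x - peak) / width, 0.0)


# Hypothetical attribute: pounds of apple juice purchased, ranging over 0..10,
# partitioned into three terms that could be read as "small", "medium", "large".
degrees = [triangular_membership(2.5, i, 3, 0.0, 10.0) for i in (1, 2, 3)]
```

Under these assumptions a purchase of 2.5 pounds belongs equally to the first and second terms and not at all to the third, which is how a single crisp quantity can support more than one linguistic description.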
Generate C1 and C2
In this section, the concrete meanings of C1 and C2 are described in detail, and the computational steps of the proposed method are also briefly introduced as follows.
The fuzzy sequential pattern mining consists of two phases. After candidate 1-dim fuzzy grids have been generated, we must determine how to find frequent fuzzy grids, frequent fuzzy k-sequences (k⩾1) and fuzzy sequential patterns from those candidate 1-dim fuzzy grids. Frequent fuzzy grids with small dimension, say m, are used to
Competence set expansion
Competence set expansion refers to finding a learning sequence for acquiring the needed skills so that the needed competence set is obtained [8]. It can be regarded as a tree construction process if there are no compound skills [19]. Feng and Yu [8] proposed the minimum spanning table method (MST), which employs a directed graph with an expansion table to find a spanning tree with minimum cost. An optimal expansion is then obtained from the minimum spanning tree.
This procedure views each
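To make the idea of a minimum-cost expansion concrete, the following minimal sketch searches for a cheapest spanning tree of a small directed graph rooted at the already acquired skill set, and then reads a learning sequence off that tree. It is a brute-force stand-in for illustration only, not the expansion-table procedure of Feng and Yu [8]; the skill names, the cost values and all function names are hypothetical.

```python
from itertools import product


def _all_reach_root(parent, root):
    """True if following predecessors from every skill ends at the root."""
    for v in parent:
        seen = set()
        while v != root:
            if v in seen:            # a cycle that never reaches the root
                return False
            seen.add(v)
            v = parent[v]
    return True


def min_cost_expansion(root, skills, cost):
    """Brute-force minimum-cost spanning tree of a directed graph.

    cost[(u, v)] is the assumed cost of acquiring skill v once u is held.
    Only practical for a handful of skills.
    """
    nodes = [root] + list(skills)
    # Each skill is learned from exactly one predecessor.
    choices = [[u for u in nodes if (u, v) in cost] for v in skills]
    best_total, best_parent = float("inf"), None
    for picks in product(*choices):
        parent = dict(zip(skills, picks))
        if not _all_reach_root(parent, root):
            continue
        total = sum(cost[(parent[v], v)] for v in skills)
        if total < best_total:
            best_total, best_parent = total, parent
    return best_total, best_parent


def learning_sequence(root, parent):
    """Breadth-first order: every skill appears after its predecessor."""
    order, frontier = [], [root]
    while frontier:
        u = frontier.pop(0)
        children = sorted(v for v, p in parent.items() if p == u)
        order.extend(children)
        frontier.extend(children)
    return order


# Hypothetical costs; "SK" stands for the already acquired skill set.
cost = {("SK", "a"): 2, ("SK", "b"): 5, ("SK", "c"): 6,
        ("a", "b"): 1, ("a", "c"): 4, ("b", "a"): 3,
        ("b", "c"): 2, ("c", "a"): 1, ("c", "b"): 1}
total, parent = min_cost_expansion("SK", ["a", "b", "c"], cost)
sequence = learning_sequence("SK", parent)
```

With these made-up costs the cheapest tree learns a from SK, b from a and c from b, for a total cost of 5. Exhaustive enumeration grows exponentially with the number of skills, so it only serves to show what an optimal expansion looks like on a toy instance.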
Numerical example
A database relation, BOUGHT, with 10 tuples tp(r) (1⩽r⩽5, α1=2, α2=3, α3=1, α4=3, α5=1) is given in Table 1, where the asterisks denote that one product was not purchased in that transaction. We first employ the method for fuzzy sequential pattern mining to discover fuzzy sequential patterns from BOUGHT. Subsequently, two-stage learning sequences are derived from those fuzzy terms found in the mining results. For simplicity, some columns or rows of the following tables are omitted and indicated
Discussions and further topics
Fuzzy knowledge discovered in the fuzzy sequential pattern mining can be viewed as a competence set of one decision problem, and the acquisition of two-stage learning sequences, consisting of C1 and C2, with minimum learning costs is the focus of this paper. In addition to the aforementioned descriptions in previous sections, many further topics should be discussed.
First, the meaning of the fuzzy terms of the quantitative attribute xl can be changed by a linguistic hedge [13] such as “very” or
Acknowledgements
We would like to thank the anonymous referees for their valuable comments and constructive suggestions.
References (22)
- et al., A foundation for competence set analysis, Mathematical Social Sciences (1990)
- Fuzzy sets, Information Control (1965)
- The concept of a linguistic variable and its application to approximate reasoning, Information Science (part 1) (1975)
- Approximation of fuzzy concepts in decision making, Fuzzy Sets and Systems (1997)
- et al., Data Mining: Concepts and Techniques (2001)
- R. Agrawal, R. Srikant, Mining sequential patterns, in: Proceedings of the 11th International Conference on Data...
- Web usage mining for Web site evaluation, Communications of the ACM (2000)
- et al., A fuzzy data mining algorithm for finding sequential patterns, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems (2003)
- Forming Winning Strategies: An Integrated Theory of Habitual Domains (1990)
- et al., Optimal competence set expansion using deduction graph, Journal of Optimization Theory and Application (1994)