Handling sequential pattern decay: Developing a two-stage collaborative recommender system

https://doi.org/10.1016/j.elerap.2008.10.001

Abstract

This study proposes a sequential pattern based collaborative recommender system that predicts a customer’s time-variant purchase behavior in an e-commerce environment where purchase patterns may change gradually. A new two-stage recommendation process is developed to predict customer purchase behavior for product categories as well as for product items. A time window weight is introduced so that sequential patterns closer to the current time period have a larger impact on the prediction than patterns further from it. This study is the first to propose time-decaying sequential patterns within a collaborative recommender system. Experimental results on a public food mart dataset and a synthetic dataset show that the proposed system outperforms the traditional collaborative system.

Introduction

Recommender systems are developed to deal with information overload and to provide personalized recommendations, content and services to users (Adomavicius and Tuzhilin, 2005, Rashid et al., 2002). These software systems have been applied in many areas including e-commerce, news, advertising, document management and e-learning. In e-commerce applications, recommender systems can potentially turn browsers into buyers by providing personalized shopping information that interests the customer, thus improving cross-sales and building customer loyalty (Schafer et al., 1999, Schafer et al., 2001).

Recommender systems make recommendations in three basic steps: acquiring preferences from the customers’ input data, computing recommendations using suitable techniques and presenting the recommendations to customers (Wei et al., 2007). Customer preferences can be captured from four types of customer data: demographic data, rating data, browsing pattern data and transaction data (Wei et al., 2007). Transaction data provide sequences of purchased items and can be used to predict future customer purchasing preferences. In a rapidly changing business environment, product life cycles have become shorter and customer buying behavior changes more frequently. Therefore, some researchers (Cho et al., 2005, Min and Han, 2005) have used the customer’s time-variant buying sequence as a dynamic customer profile to improve recommender system performance. The periodic transaction data for a customer can be called the “dynamic customer profile”, a term introduced by Cho et al. (2005). Traditional recommender systems incorporate customer transaction data from only a single temporal period, known as the static customer profile, and therefore do not make use of customer purchase sequences (Cho et al., 2005). Exploiting the dynamic nature of a customer’s purchase sequences can improve the recommendation quality of recommender systems.

Although the dynamic customer profile provides a basis for generating recommendations, in an environment in which customers gradually change their purchase patterns (interests), the transaction data close to the current temporal period are usually more important than those far from it. This study introduces time window weights that give greater importance to the patterns generated in more recent time periods, in order to improve the quality of recommendations based on the dynamic customer profile.
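The idea of weighting recent periods more heavily can be illustrated with a small sketch. It is not the weighting scheme of this study: the exponential form, the `decay` parameter, the function names and the sample profile are all assumptions made for illustration.

```python
from collections import defaultdict


def time_window_weights(num_periods, decay=0.5):
    """Return one weight per period, oldest first, normalized to sum to 1.

    Assumption: exponential decay, so the most recent period gets the
    largest weight. The study's actual weighting may differ.
    """
    raw = [decay ** (num_periods - 1 - i) for i in range(num_periods)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_category_scores(purchases_by_period, decay=0.5):
    """Score categories by purchase counts weighted toward recent periods.

    purchases_by_period: list of lists of category names, oldest period first.
    """
    weights = time_window_weights(len(purchases_by_period), decay)
    scores = defaultdict(float)
    for weight, purchases in zip(weights, purchases_by_period):
        for category in purchases:
            scores[category] += weight
    return dict(scores)


# Hypothetical dynamic profile over periods T-2, T-1 and T.
profile = [["snacks", "snacks"], ["dairy"], ["dairy", "produce"]]
scores = weighted_category_scores(profile)
```

In this hypothetical profile, a single recent purchase of produce outweighs two older purchases of snacks, which is the effect a time window weight is meant to produce.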

In addition to the time-variant dynamic customer profile, the technique used to predict customer purchases plays an important role in the design of a recommender system. Many different approaches, categorized as heuristic-based or model-based (Adomavicius and Tuzhilin, 2005), have been applied to the problem of making accurate recommendations. These techniques include nearest neighbor algorithms, Bayesian analysis, neural networks and association rules. Association rules have been used for many years in e-commerce merchandising, both to analyze preference patterns across products and to recommend products to consumers based on other products that they have selected (Schafer et al., 2001). An association rule expresses the relationship that one product is often purchased along with other products.
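As a minimal sketch of how such a rule is usually quantified, the following computes the standard support and confidence measures over a set of baskets. The function names and the basket data are hypothetical and are not taken from this study.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    itemset = set(itemset)
    hits = sum(itemset <= set(t) for t in transactions)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

Here bread and milk appear together in two of four baskets, and milk appears in two of the three baskets that contain bread.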

Recently, sequential pattern mining, a form of temporal data mining, was introduced by Agrawal and Srikant (1995). A sequence database consists of a series of sequences. Each sequence is composed of several transactions (also called events) sorted in ascending time order. Sequential pattern mining is the mining of frequently occurring ordered events or sequences (Han and Kamber, 2006). Unlike association rules, sequential patterns may suggest that a consumer who buys a new product in the current time period is likely to buy another product in the next time period. Sequential pattern mining has been applied to sales promotions, targeted marketing, production processes, web access pattern analysis, network intrusion detection and DNA sequence analysis. In e-commerce, sequential patterns are useful for personalizing product recommendations and product related advertisements to improve customer satisfaction. There has not been any research on a collaborative system that uses sequential purchase patterns based on a multi-period time-variant customer profile. Researchers have addressed time decay in the importance of previous purchases (Cho et al., 2005, Min and Han, 2005), but no system has yet been developed that applies time decay to sequential patterns.
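To make the notion of an ordered pattern concrete, the sketch below checks whether a customer sequence contains a pattern and computes the pattern's support over a small sequence database. It illustrates only the containment test and not a full mining algorithm; the function names and data are hypothetical.

```python
def contains_pattern(sequence, pattern):
    """True if the pattern's itemsets occur, in order, in distinct transactions.

    sequence: list of transactions (sets of items) in ascending time order.
    pattern:  list of itemsets that must be matched in the same order.
    """
    position = 0
    for itemset in pattern:
        # Advance to the earliest remaining transaction that covers this itemset.
        while position < len(sequence) and not set(itemset) <= set(sequence[position]):
            position += 1
        if position == len(sequence):
            return False
        position += 1  # The next itemset must match a later transaction.
    return True


def sequence_support(database, pattern):
    """Fraction of sequences in the database that contain the pattern."""
    if not database:
        return 0.0
    return sum(contains_pattern(s, pattern) for s in database) / len(database)


# Hypothetical sequence database: one list of transactions per customer.
database = [
    [{"phone"}, {"case"}, {"charger"}],
    [{"phone", "case"}, {"charger"}],
    [{"case"}, {"phone"}],
]
phone_then_charger = sequence_support(database, [{"phone"}, {"charger"}])
phone_then_case = sequence_support(database, [{"phone"}, {"case"}])
```

The second customer buys the phone and the case in the same transaction, so that sequence does not count toward the ordered pattern "phone, then case". This is the distinction between a sequential pattern and an association rule.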

Collaborative recommender systems recommend items that are similar to items already purchased by people in the same preference group. These systems suffer from data sparsity: the number of products that any one customer has purchased is small compared with the size of a large product set (e.g. Amazon.com). In this situation, customers may have purchased too few products in common, so the recommender system may make poor recommendations for a particular customer or be unable to make any at all. This is known as the data sparsity problem (Adomavicius and Tuzhilin, 2005, Sarwar et al., 2000). To reduce this limitation, this study uses a two-stage recommendation: predicting the top-M product categories according to sequential patterns and then deriving the top-N product items among the predicted product categories. In the first stage, the sequential purchasing patterns for the product categories are established in advance. The target customers’ likely favorite product categories can then be predicted based on these sequential purchase patterns. Focusing on product category prediction has the merits of reducing the data dimension, increasing scalability and reducing the data sparsity limitation. In the second stage, the product items that the target customers are likely to purchase are generated using the top-N approach based on the categories predicted in the first stage. The proposed two-stage recommender system, based on the time-variant customer profile, can improve the recommendation quality.
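The two-stage selection can be sketched as a filter followed by a ranking. The sketch assumes that category scores (stage one) and item scores (stage two) have already been computed; how this study derives them is not reproduced here, and all names and values below are hypothetical.

```python
def top_m_categories(category_scores, m):
    """Stage one: keep the m highest-scoring categories (ties broken by name)."""
    ranked = sorted(category_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:m]]


def top_n_items(item_scores, item_category, categories, n):
    """Stage two: rank only the items that belong to the predicted categories."""
    allowed = set(categories)
    candidates = [
        (item, score)
        for item, score in item_scores.items()
        if item_category.get(item) in allowed
    ]
    candidates.sort(key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in candidates[:n]]


def two_stage_recommend(category_scores, item_scores, item_category, m, n):
    """Predict top-M categories, then derive top-N items within them."""
    categories = top_m_categories(category_scores, m)
    return top_n_items(item_scores, item_category, categories, n)


# Hypothetical scores for one active customer.
category_scores = {"dairy": 0.9, "snacks": 0.4, "produce": 0.6}
item_scores = {"milk": 5, "cheese": 3, "chips": 9, "apple": 4}
item_category = {
    "milk": "dairy",
    "cheese": "dairy",
    "chips": "snacks",
    "apple": "produce",
}
recommended = two_stage_recommend(category_scores, item_scores, item_category, m=2, n=2)
```

In this example the highest-scoring item overall is excluded because its category is not among the top-M. Restricting the item search to a few categories in this way is what reduces the data dimension in the second stage.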

The rest of this paper is organized as follows. Section 2 introduces related work. Section 3 describes the framework of the proposed sequential pattern based collaborative recommender system. Section 4 examines the performance of the proposed CF system using a public food mart dataset and a simulated dataset. Section 5 provides conclusions.

Section snippets

Recommender systems

In addition to the collaborative filtering technique (CF) (Resnick et al., 1994, Wang et al., 2004), the content-based filtering (CBF) (Krulwich and Burkey, 1996, Lang, 1995) and hybrid-based recommender systems (Balabanovic and Shoham, 1997) have also been used (Adomavicius and Tuzhilin, 2005, Wei et al., 2007, Sarwar et al., 2000, Perugini et al., 2004). The CBF recommends items similar to the customer’s past preferences (customer profiles). A customer profile that contains information about

System framework

Given all of the target customers’ transactional sequences in the current time period T and the previous r periods, T − 1, T − 2, …, and T − r, this study determines the active customer’s most likely purchase items in the next time period T + 1 (target prediction period). The proposed system, as shown in Fig. 1, consists of model training for the target customers and model use (implementation) for the active customers. Active customers are selected from the target customer to receive recommendations

Datasets and experiments

We implemented a prototype of the proposed recommender system based on the Foodmart supermarket database, which is a public database stored in Microsoft SQL Server 2000, and a synthetic dataset from the IBM AssocGen program (Agrawal and Srikant, 1994). The prototype system was implemented on an Intel Pentium M 740 1.73 GHz processor. The operating system is Windows XP with web server IIS 5.1. The proposed recommender system was developed using MS Visual InterDev 6.0 in Active Server Page (ASP)

Findings and applications

Because a customer’s purchasing patterns may change gradually, the customers’ transaction data can be analyzed on a multi-period basis. The sequential patterns closer to the current periods are given larger weights by the time window weight. Traditional recommender systems have little capability to deal with situations in which the customers’ preferences change gradually, as they discover customers’ purchasing patterns based on transaction data using only a single period. Based on the dynamic

Acknowledgements

The authors would like to thank the anonymous referees for their valuable comments, which helped in improving the quality of this paper. The authors would also like to thank the National Science Council of the Republic of China, Taiwan for financially supporting this research under Contract No. NSC 96-2221-E-327-008.

References (48)

  • L.Y. Tseng et al. A genetic approach to the automatic clustering problem. Pattern Recogn (2001)
  • Y.F. Wang et al. A personalized recommender system for the cosmetic business. Expert Syst Appl (2004)
  • G. Adomavicius et al. Towards the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng (2005)
  • Agrawal R, Srikant R. Mining sequential patterns. Research Report RJ 9910, IBM Almaden Research Center, San Jose,...
  • Agrawal R, Srikant R. Mining sequential patterns. In: IEEE international conference on data engineering, Taipei,...
  • Agrawal R, Srikant R. Mining sequential patterns: generalizations and performance improvements. In: Proceeding of the...
  • M. Balabanovic et al. Fab: content-based collaborative recommendation. Commun ACM (1997)
  • Y.L. Chen et al. Discovering fuzzy time-interval sequential patterns in sequence databases. IEEE Trans Syst, Man Cybernet, B (2005)
  • W. Conover. Practical nonparametric statistics (1980)
  • D.L. Davies et al. A cluster separation measure. IEEE Trans Pattern Anal Machine Intell (1979)
  • L. Davis. Handbook of genetic algorithms (1991)
  • D.E. Goldberg. Genetic algorithms in search optimization and machine learning (1989)
  • D. Goldberg et al. Using collaborative filtering to weave an Information tapestry. Commun ACM (1992)
  • J. Han et al. Data mining concepts and techniques (2006)