Classifying the segmentation of customer value via RFM model and RS theory

https://doi.org/10.1016/j.eswa.2008.04.003

Abstract

Data mining is a powerful new technique that helps companies mine patterns and trends in their customer data and thereby improve customer relationships, and it is one of the well-known tools for customer relationship management (CRM). However, data mining tools have drawbacks: for example, neural networks require long training times, and genetic algorithms are a brute-force computing method. This study proposes a new procedure that combines the quantitative values of RFM attributes and the K-means algorithm with rough set theory (RS theory) to extract meaningful rules, and it can effectively mitigate these drawbacks. The study has three purposes: (1) to discretize continuous attributes to enhance the rough sets algorithm; (2) to cluster customer value as the output (customer loyalty), partitioned into 3, 5 and 7 classes based on a subjective view, and then determine which partition yields the best accuracy rate; and (3) to identify customer characteristics in order to strengthen CRM.

A dataset collected in practice from C-company in Taiwan’s electronic industry is employed in an empirical case study to illustrate the proposed procedure. Following [Hughes, A. M. (1994). Strategic database marketing. Chicago: Probus Publishing Company], this study first uses the RFM model to yield quantitative values as input attributes; next, it uses the K-means algorithm to cluster customer value; finally, it employs rough sets (the LEM2 algorithm) to mine classification rules that help enterprises achieve excellent CRM. In the analysis of the empirical results, the proposed procedure outperforms the listed methods in accuracy rate for all of the 3-, 5- and 7-class outputs, and it generates understandable decision rules.

Introduction

Because business operations have become complex and diversified, information is an essential and vital force for a company’s competitive advantage and its survival as a going concern. In particular, the growth of information technology (IT) in today’s rapidly changing and competitive environment stimulates transaction activity, which in turn increasingly facilitates market competition. Given this relationship, information is central to how companies face the opportunities and challenges of day-to-day operations. It is very difficult for a company to strengthen its competitive advantage if information merely supports internal functions while the company faces heavy challenges from its external surroundings. Thus, how companies can enhance their market competitiveness is an interesting issue, because the greater the competitive power, the greater the probability of remaining a going concern. The key to company profitability is to integrate the upstream members of the supply chain via effective IT in order to reduce cost, and to reinforce downstream customer relationships via excellent CRM in order to gain more profit. CRM has become the focal point of company profits and is increasingly important because customers are the main source of profit. Therefore, this study maintains that excellent CRM is critical for companies to gain more profit.

Fulfilling customer requirements is one of the key factors in the success of business operations. CRM aims to meet the needs of customers and to strengthen the company’s relationship with them (Thompson & Sims, 2002). The effective and efficient use of IT to support the CRM process is a short path to successful CRM. Although companies differ in how they understand their customers’ situations, all companies that provide products and services to satisfy customer demand share similar aims: to mine valuable customer information, to maximize customer value, to increase customer loyalty and, finally, to obtain ample profits for themselves (Joo & Sohn, 2008). Therefore, a large number of companies apply different tools, such as computer software packages and statistical techniques, to make CRM more efficient and to understand their customers better.

Nowadays, data mining tools that assist CRM, including decision trees (DT), artificial neural networks (ANN), genetic algorithms (GA) and association rules (AR), are commonly used in fields such as engineering, science, finance and business to solve customer-related problems (Witten & Frank, 2005). A decision tree is a flow-chart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of the test, and leaf nodes represent classes or class distributions (Han & Kamber, 2001). An artificial neural network consists of a large number of highly interconnected processing elements (neurons) and uses a mathematical model, a computational model or non-linear statistical data modeling tools for information processing to capture and represent complex input/output relationships. Genetic algorithms, which were formally introduced in the United States in the 1970s by John Holland at the University of Michigan, are search algorithms based on the mechanics of natural selection and the process of natural evolution (Holland, 1973; Miller et al., 1989). Association rules, which are based on co-occurrence, can express relationships such as customers who buy X tending to buy Y, and can support related business activities such as product promotions, CRM programs and inventory control. The problem of mining association rules was first stated in 1993, and association rules are today among the most widely used tools for finding relationships among items (products) in large databases (Dunham, 2003).
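The statement that customers who buy X tend to buy Y is usually quantified with two standard measures, support and confidence. The sketch below computes both in plain Python; the baskets are invented for illustration and do not come from the C-company dataset, and the function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical baskets, invented for illustration only.
baskets = [
    {"X", "Y"},
    {"X", "Y", "Z"},
    {"X"},
    {"Y", "Z"},
]

support_x = support(baskets, {"X"})                # X appears in 3 of 4 baskets
confidence_x_y = confidence(baskets, {"X"}, {"Y"})  # 2 of the 3 X-baskets also hold Y
```

A rule miner keeps only rules whose support and confidence exceed chosen thresholds, which is one way to limit the number of rules generated.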

In recent years, data mining has gained great popularity not only in research but also in commercial applications. Data mining can help organizations discover meaningful trends, patterns and correlations in their customer, product, or other data, to improve customer relationships and thereby decrease the risk of business operations. The basic data mining techniques include classification, clustering, association rules, regression analysis and sequence analysis. Other data mining techniques include rule-based reasoning, genetic algorithms, decision trees, fuzzy logic, inductive learning systems and statistical methods (Witten & Frank, 2005).

Generally, no data mining tool for CRM is perfect, because each has some drawbacks. For example, with decision trees, too many instances lead to large trees, which may decrease the classification accuracy rate and obscure the relationships derived from the training examples. With artificial neural networks, the number of hidden neurons, the number of hidden layers and the training parameters need to be determined, and training times are long, especially on large datasets. Moreover, an ANN acts as a “black box”, which can lead to inconsistent outputs, and building one is a trial-and-error process. Genetic algorithms have drawbacks such as slow convergence, brute-force computation, long computation times and lower stability. With association rules, the major drawback is that the number of generated rules is huge and many of them may be redundant.

To address the problems described in the previous paragraph, two methods, the K-means algorithm and RS theory, are worth exploring in this study. K-means is one of the well-known algorithms for cluster analysis and has been used extensively in various fields, including data mining, statistical data analysis and other business applications. Cluster analysis is a statistical technique used to identify a set of groups that both minimizes within-group variation and maximizes between-group variation based on a distance or dissimilarity function; its aim is to find an optimal set of clusters (Witten & Frank, 2005). Rough set theory (RS theory) offers five advantages: (1) it does not require any preliminary or additional parameters about the data; (2) it can work with missing values, switch among different reducts, and generate rules at lower cost or in less time; (3) it can handle large amounts of both quantitative and qualitative data; (4) it yields understandable decision rules and is stable; and (5) it can model highly non-linear or discontinuous functional relationships, which provides a powerful method for characterizing complex, multidimensional patterns (Hashemi et al., 1998; Pawlak, 1982). Thus, this study uses RS theory to cope with these shortcomings and thereby improve CRM for enterprises, based on RFM (recency–frequency–monetary) attributes and the K-means method for clustering customer value. Overall, the purpose of this study is to generate classification rules for achieving excellent CRM, which is believed to maximize profits in a win–win situation for company and customer.
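The basic rough set notions behind these advantages can be shown with a small example. The sketch below computes the lower and upper approximations of a target class from discretized attributes, following Pawlak's standard definitions. It is not the LEM2 algorithm, and the customer records, attribute names and function names are hypothetical.

```python
from collections import defaultdict


def indiscernibility_classes(rows, attributes):
    """Group row indices that share the same values on `attributes`."""
    classes = defaultdict(set)
    for idx, row in enumerate(rows):
        key = tuple(row[a] for a in attributes)
        classes[key].add(idx)
    return list(classes.values())


def approximations(rows, attributes, target):
    """Return the (lower, upper) approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(rows, attributes):
        if block <= target:   # block lies entirely inside the target class
            lower |= block
        if block & target:    # block overlaps the target class
            upper |= block
    return lower, upper


# Hypothetical discretized customers, invented for illustration only.
customers = [
    {"R": "high", "F": "high", "loyalty": "loyal"},  # index 0
    {"R": "high", "F": "high", "loyalty": "loyal"},  # index 1
    {"R": "low", "F": "high", "loyalty": "loyal"},   # index 2
    {"R": "low", "F": "high", "loyalty": "lost"},    # index 3
    {"R": "low", "F": "low", "loyalty": "lost"},     # index 4
]

loyal = {i for i, c in enumerate(customers) if c["loyalty"] == "loyal"}
lower, upper = approximations(customers, ["R", "F"], loyal)
```

Customers 2 and 3 are indiscernible on R and F but differ in loyalty, so they fall in the upper approximation only. Certain rules can be drawn from the lower approximation, and possible rules from the upper one.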

The rest of the paper is organized as follows. Section 2 gives an overview of related work, and Section 3 presents the proposed procedure and briefly discusses its architecture. Section 4 analyzes the experimental results. Finally, Section 5 concludes the paper.

Related works

This study proposes an enhanced rough sets method and verifies whether it can help operations management in enterprises. This section mainly explores the issues and theoretical aspects of operation models and management, together with some techniques for clustering customer value. Accordingly, it reviews related studies on customer relationship management, customer value analysis, the K-means algorithm, rough set theory and the LEM2 rule extraction method.
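As an illustration of how RFM attributes can be turned into quantitative values, the sketch below assumes quintile scoring, in which customers are ranked on each attribute and split into five equal-sized bands scored 1 to 5. This is a common reading of the RFM model and may differ from the exact scheme used in the study; the sample figures and the function name are hypothetical.

```python
def quintile_scores(values, higher_is_better=True):
    """Rank the values and assign scores 1..5 by equal-sized rank bands.

    Ties are broken by input order, so tied customers can land in
    adjacent bands; a production version would need a tie policy.
    """
    n = len(values)
    # Sort indices from worst to best for this attribute.
    order = sorted(range(n), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = rank * 5 // n + 1
    return scores


# Hypothetical customers, invented for illustration only.
recency_days = [3, 40, 12, 90, 25]     # days since last purchase (lower is better)
frequency = [12, 2, 7, 1, 4]           # number of purchases
monetary = [900, 150, 300, 50, 600]    # total spend

r_scores = quintile_scores(recency_days, higher_is_better=False)
f_scores = quintile_scores(frequency)
m_scores = quintile_scores(monetary)
rfm = list(zip(r_scores, f_scores, m_scores))
```

Scoring also discretizes the continuous attributes, which is the form of input that a rough set rule learner expects.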

Methodology

This section briefly introduces the research model of this study and the proposed procedure for classifying customer value.
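The clustering step of such a procedure can be sketched with a minimal K-means (Lloyd's algorithm) over RFM score triples. The first k points are used as initial centroids so that the run is reproducible; this initialization, the sample points and the choice of k = 3 are assumptions made for illustration, not details taken from the study.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100):
    """Minimal Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # assumed deterministic init
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return labels, centroids


# Hypothetical (R, F, M) score triples, invented for illustration only.
rfm_points = [(5, 5, 5), (1, 1, 1), (3, 3, 3), (5, 4, 5), (1, 2, 1), (3, 3, 2)]
labels, centroids = kmeans(rfm_points, k=3)
```

The resulting cluster labels can serve as the output classes for which classification rules are later extracted. Rerunning with k = 5 or k = 7 on a larger dataset would correspond to the other class counts considered in the study.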

CRM aims to meet the needs of customers and to strengthen the company’s relationship with them (Thompson & Sims, 2002). In recent years, data mining has gained great popularity not only in research but also in commercial applications. Nowadays, data mining tools that assist CRM, including DT, ANN, GA and AR, are commonly used in some fields...

Empirical case study

In this section, we introduce the empirical case (C-company) and the computing process using the C-company dataset.

Conclusions

This study has proposed a procedure that combines RFM attributes and the K-means algorithm with rough set theory (the LEM2 algorithm), not only to enhance classification accuracy but also to extract classification rules for achieving excellent CRM for enterprises. Additionally, it can effectively mitigate some drawbacks of data mining tools. To demonstrate the proposed procedure, this study employs a dataset collected in practice from C-company in Taiwan’s electronic industry, which includes 401 instances, ...

References (30)

  • J. Han et al. Data mining: Concepts and techniques (2001)
  • J.H. Holland. Genetic algorithms and the optimal allocation of trials. SIAM Journal on Computing (1973)
  • A.M. Hughes. Strategic database marketing (1994)
  • R. Kalakota et al. e-Business roadmap for success (1999)
  • U. Kaymak. Fuzzy target selection using RFM variables. In IFSA World congress and 20th NAFIPS international... (2001)