Segmentation modeling algorithm: a novel algorithm in data mining

Information Technology and Management

Abstract

Many enterprises have accumulated large amounts of data over time. To gain a competitive advantage, they need effective ways to analyze and understand this raw data. Various methods and techniques have been used to reduce data volume to a manageable level and to help enterprises identify the business value in their data sets. In particular, segmentation methods are widely used in data mining. In this paper, we present a new data segmentation algorithm that can be used to build time-dependent customer behavior models. The proposed model has the potential to solve the optimization problem in data segmentation.



Author information

Corresponding author

Correspondence to Larisa Bulysheva.

About this article

Cite this article

Bulysheva, L., Bulyshev, A. Segmentation modeling algorithm: a novel algorithm in data mining. Inf Technol Manag 13, 263–271 (2012). https://doi.org/10.1007/s10799-012-0136-7

  • DOI: https://doi.org/10.1007/s10799-012-0136-7
