Analyzing customer behavior from shopping path data using operation edit distance

Applied Intelligence

Abstract

Radio frequency identification (RFID) technology has been successfully applied to capture customers’ shopping habits from their motion paths and other behavioral data. This behavioral data can be used for marketing purposes, such as improving the store layout or targeting promotions to specific customers. Data mining techniques such as clustering algorithms can uncover hidden customer behaviors from shopping paths. However, shopping path data poses particular challenges: paths vary in length, the data is sequential, and comparing paths requires a special distance measure. Because of these challenges, traditional clustering algorithms cannot be applied to shopping path data. In this paper, we analyze customer behavior from shopping path data using a clustering algorithm. We propose a new distance measure for shopping path data, called the operation edit distance, to address these challenges. The proposed measure enables RFID shopping path data to be processed effectively by clustering algorithms. We collected real-world shopping path data from a retail store and applied our method to it. The proposed method effectively identified customers’ shopping patterns in the data.
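To make the role of a sequence distance concrete, the sketch below computes the classic Levenshtein edit distance (unit-cost insert, delete, substitute) between shopping paths of different lengths and builds a pairwise distance matrix. This is not the operation edit distance proposed in the paper, whose definition is not given in the abstract; it only illustrates the general idea of comparing variable-length paths. The zone names and the helper names `edit_distance` and `distance_matrix` are hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Classic Levenshtein distance between two zone sequences (unit costs).

    Illustrative stand-in only; NOT the paper's operation edit distance.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, zone_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty path takes i deletions
        for j, zone_b in enumerate(b, start=1):
            cost = 0 if zone_a == zone_b else 1
            curr.append(min(
                prev[j] + 1,         # delete zone_a
                curr[j - 1] + 1,     # insert zone_b
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def distance_matrix(paths: Sequence[Sequence[str]]) -> List[List[int]]:
    """Pairwise distances between all paths (symmetric, zero diagonal)."""
    return [[edit_distance(p, q) for q in paths] for p in paths]


# Hypothetical shopping paths, each a sequence of visited store zones.
paths = [
    ["entrance", "produce", "dairy", "checkout"],
    ["entrance", "produce", "bakery", "dairy", "checkout"],
    ["entrance", "snacks", "checkout"],
]
matrix = distance_matrix(paths)
```

A matrix like this could then be passed to any clustering routine that accepts precomputed pairwise distances, which is one way around the variable-length problem that fixed-dimension methods face.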

Author information

Corresponding author

Correspondence to Young S. Kwon.

About this article

Cite this article

Syaekhoni, M.A., Lee, C. & Kwon, Y.S. Analyzing customer behavior from shopping path data using operation edit distance. Appl Intell 48, 1912–1932 (2018). https://doi.org/10.1007/s10489-016-0839-2

  • DOI: https://doi.org/10.1007/s10489-016-0839-2
