Abstract
Radio frequency identification (RFID) technology has been successfully applied to gather customers’ shopping habits from their motion paths and other behavioral data. This behavioral data can be used for marketing purposes, such as improving the store layout or targeting promotions to specific customers. Data mining techniques such as clustering algorithms can be used to discover customers’ hidden behaviors from their shopping paths. However, shopping path data poses particular challenges: paths vary in length, the data is sequential, and comparing paths requires a special distance measure. Because of these challenges, traditional clustering algorithms cannot be applied directly to shopping path data. In this paper, we analyze customer behavior from shopping path data using a clustering algorithm. To address the aforementioned problems, we propose a new distance measure for shopping path data, called the Operation edit distance. The proposed distance measure allows RFID customer shopping path data to be processed effectively by clustering algorithms. We collected real-world shopping path data from a retail store and applied our method to the dataset. The proposed method effectively identified customers’ shopping patterns from the data.
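To make the variable-length problem concrete, the following minimal sketch computes the standard Levenshtein edit distance between two shopping paths represented as sequences of zone labels. It is not the Operation edit distance proposed in the paper, whose definition is not given in this abstract; it only illustrates the kind of baseline sequence distance that such a measure extends. The zone names and the `levenshtein` helper are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Standard Levenshtein distance between two sequences of zone labels.

    Each insertion, deletion, or substitution of one zone costs 1.
    This is the classic baseline, NOT the paper's Operation edit distance.
    """
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, zone_a in enumerate(a, start=1):
        curr = [i]  # turning i items of a into an empty path needs i deletions
        for j, zone_b in enumerate(b, start=1):
            substitution_cost = 0 if zone_a == zone_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete zone_a
                curr[j - 1] + 1,                  # insert zone_b
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical paths of different lengths (zone names are illustrative).
path_a = ["entrance", "produce", "dairy", "checkout"]
path_b = ["entrance", "produce", "bakery", "dairy", "snacks", "checkout"]

print(levenshtein(path_a, path_b))  # 2: two zones must be inserted into path_a
```

A matrix of such pairwise distances can be passed to any clustering algorithm that accepts precomputed dissimilarities, which is why the choice of distance measure matters for paths of unequal length.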
Cite this article
Syaekhoni, M.A., Lee, C. & Kwon, Y.S. Analyzing customer behavior from shopping path data using operation edit distance. Appl Intell 48, 1912–1932 (2018). https://doi.org/10.1007/s10489-016-0839-2