Abstract
Recently, a growing number of off-line stores have become willing to conduct customer behavior analysis. In particular, predicting revisit intention is of prime importance, because converting first-time visitors into loyal customers is highly profitable. Thanks to noninvasive monitoring, shopping behaviors and revisit statistics are available for the large proportion of customers who turn on their mobile devices. In this paper, we propose a systematic framework for predicting the revisit intention of customers using Wi-Fi signals captured by in-store sensors. Using data collected from seven flagship stores in downtown Seoul, we achieved 67–80% prediction accuracy for all customers and 64–72% prediction accuracy for first-time visitors. Considering customer mobility improved performance by 4.7–24.3%. Furthermore, we provide an in-depth analysis of the effect of the data collection period and visit frequency on prediction performance, and we demonstrate the robustness of our model to missing customers. We have released tutorials and benchmark datasets for revisit prediction at https://github.com/kaist-dmlab/revisit.
Notes
The proportion of users in their twenties who keep their Wi-Fi on is 29.2%, according to a survey by Korea Telecom (July 2015).
In Fig. 2, the ratio of first-time visitors in store E_GN is over 70%. We made a few assumptions in order to interpret the data as is; they are discussed in “Appendix D”.
Owing to a nondisclosure agreement, additional store information cannot be disclosed. Readers may assume that the other stores are covered by dozens of sensors in a similar manner.
Based on the result of Sect. 4.2.3, cross-validation is considered safe to perform for our model.
For this experiment, we included visit count and date in our feature set, so the overall accuracy is slightly higher than the values reported in Table 4.
Scikit-learn 0.20, which is the latest version at the time of this submission, was used for the experiments.
References
Baumann P, Kleiminger W, Santini S (2013) The influence of temporal and spatial features on the performance of next-place prediction algorithms. In: Proceedings of the 2013 ACM international joint conference on pervasive and ubiquitous computing. ACM, pp 449–458
Besse PC, Guillouet B, Loubes J-M, Royer F (2017) Destination prediction by trajectory distribution based model. IEEE Trans Intell Transp Syst 99:1–12
Brébisson A, Simon É, Auvolat A, Vincent P, Bengio Y (2015) Artificial neural networks applied to taxi destination prediction. In: Proceedings of the 2015 ECML/PKDD discovery challenge. Springer, pp 40–51
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 785–794
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
Geng W, Yang G (2017) Partial correlation between spatial and temporal regularities of human mobility. Sci Rep 7:6249
Giannotti F, Nanni M, Pinelli F, Pedreschi D (2007) Trajectory pattern mining. In: Proceedings of the 13th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 330–339
Hui SK, Bradlow ET, Fader PS (2009) Testing behavioral hypotheses using an integrated model of grocery store shopping path and purchase behavior. J Consum Res 36(3):478–493
Hwang I, Jang Y (2017) Process mining to discover shoppers’ pathways at a fashion retail store using a WiFi-based indoor positioning system. IEEE Trans Autom Sci Eng 14:1786–1792
Jung S, Lim C, Yoon S (2011) Study on selecting process of visitor’s movements in exhibition space. J Archit Inst Korea Plan Des 27(12):53–62
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T-Y (2017) LightGBM: a highly efficient gradient boosting decision tree. In: Advances in neural information processing systems, vol 30. Curran Associates, Inc, pp 3146–3154
Kim S, Lee J-G (2018) Utilizing in-store sensors for revisit prediction. In: IEEE international conference on data mining. IEEE, pp 217–226
Kim T, Chu M, Brdiczka O, Begole J (2009) Predicting shoppers’ interest from social interactions using sociometric sensors. In: CHI’09 extended abstracts on human factors in computing systems. ACM, pp 4513–4518
Lee J-G, Han J, Li X (2011) Mining discriminative patterns for classifying trajectories on road networks. IEEE Trans Knowl Data Eng 23(5):713–726
Lemaître G, Nogueira F, Aridas CK (2017) Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. J Mach Learn Res 18(17):1–5
Lim C, Park H, Yoon S (2013) A study of an exhibitions space analysis according to visitor’s cognition. J Archit Inst Korea Plan Des 29(8):69–78
Lim C, Yoon S (2010) Development of visual perception effects model for exhibition space. J Archit Inst Korea Plan Des 26(5):131–138
Liu G, Nguyen TT, Zhao G, Zha W, Yang J, Cao J, Wu M, Zhao P, Chen W (2016) Repeat buyer prediction for E-commerce. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 155–164
Lu X, Wetter E, Bharti N, Tatem AJ, Bengtsson L (2013) Approaching the limit of predictability in human mobility. Sci Rep 3:2923
Lv J, Li Q, Sun Q, Wang X (2018) T-CONV: a convolutional neural network for multi-scale taxi trajectory prediction. In: Proceedings of the 2018 IEEE international conference on big data and smart computing. IEEE, pp 82–89
Martin J, Mayberry T, Donahue C, Foppe L, Brown L, Riggins C, Rye EC, Brown D (2017) A study of MAC address randomization in mobile devices and when it fails. Proc Priv Enhanc Technol 2017(4):365–383
Mathew W, Raposo R, Martins B (2012) Predicting future locations with hidden Markov models. In: Proceedings of the 2012 ACM conference on ubiquitous computing. ACM, pp 911–918
Monreale A, Pinelli F, Trasarti R, Giannotti F (2009) WhereNext: a location predictor on trajectory pattern mining. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 637–646
OpenSignal, Inc (2016) Global state of mobile networks (August 2016). Technical report
Park S, Jung S, Lim C (2001) A study on the pedestrian path choice in clothing outlets. Korean Inst Inter Des J 28:140–148
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay É (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Peppers D, Rogers M (2016) Managing customer experience and relationships. Wiley, New York
Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A (2018) CatBoost: unbiased boosting with categorical features support. In: Advances in neural information processing systems, vol 31. Curran Associates, Inc, pp 6639–6649
Ren Y, Tomko M, Salim FD, Ong K, Sanderson M (2017) Analyzing web behavior in indoor retail spaces. J Assoc Inf Sci Technol 68(1):62–76
Sapiezynski P, Stopczynski A, Gatej R, Lehmann S (2015) Tracking human mobility using WiFi signals. PLoS ONE 10(7):e0130824
Scellato S, Musolesi M, Mascolo C, Latora V, Campbell AT (2011) Nextplace: a spatio-temporal prediction framework for pervasive systems. In: Proceedings of the 9th international conference on pervasive computing. Springer, pp 152–169
Sheth A, Seshan S, Wetherall D (2009) Geo-fencing: confining Wi-Fi coverage to physical boundaries. In: Proceedings of the 7th international conference on pervasive computing, pp 274–290
Song C, Qu Z, Blumm N, Barabási A-L (2010) Limits of predictability in human mobility. Science 327(5968):1018–1021
Stanković RS, Falkowski BJ (2003) The Haar wavelet transform: its status and achievements. Comput Electr Eng 29(1):25–44
Syaekhoni A, Lee C, Kwon Y (2018) Analyzing customer behavior from shopping path data using operation edit distance. Appl Intell 48:1912–1932
Tomko M, Ren Y, Ong K, Salim F, Sanderson M (2014) Large-scale indoor movement analysis: the data, context and analytical challenges. In: Proceedings of analysis of movement data, GIScience 2014 workshop
Um S, Chon K, Ro Y (2006) Antecedents of revisit intention. Ann Tour Res 33(4):1141–1158
Wolpert DH (1992) Stacked generalization. Neural Netw 5:241–259
Xue AY, Zhang R, Zheng Y, Xie X, Huang J, Xu Z (2013) Destination prediction by sub-trajectory synthesis and privacy protection against such prediction. In: Proceedings of the 29th IEEE international conference on data engineering. IEEE, pp 254–265
Yada K (2011) String analysis technique for shopping path in a supermarket. J Intell Inf Syst 36(3):385–402
Yalowitz SS, Bronnenkant K (2009) Timing and tracking: unlocking visitor behavior. Visit Stud 12(1):47–64
Yan X, Wang J, Chau M (2015) Customer revisit intention to restaurants: evidence from online reviews. Inf Syst Front 17:645–657
Yan Z, Chakraborty D, Parent C, Spaccapietra S, Aberer K (2013) Semantic trajectories: mobility data computation and annotation. ACM Trans Intell Syst Technol 4(3):1–38
Ying JJC, Lee WC, Weng TC, Tseng VS (2011) Semantic trajectory mining for location prediction. In: Proceedings of the 19th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, pp 34–43
Yoshimura Y, Krebs A, Ratti C (2017) Noninvasive bluetooth monitoring of visitors’ length of stay at the louvre. IEEE Perv Comput 16(2):26–34
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. 2017R1E1A1A01075927). We thank Minseok Kim for helping with surveys of off-line stores and for drawing floor plans. We also thank ZOYI for active discussions regarding the datasets.
Appendices
A. Comparison on various classifiers
We compared the performance of eight classifiers. We used default parameter settings unless otherwise noted; the tuned parameters are listed below.
Classifiers provided by Scikit-learn [26].Footnote 11 The parameters used are summarized as follows.
LR (Logistic Regression): default settings.
DT (Decision Tree): max_depth = 4.
RF (Random Forests): n_estimators = 10.
AB (AdaBoost): default settings.
GB (Gradient Boosting): max_depth = 4.
Up-to-date boosting classifiers:
CAB (CatBoost): depth = 4, learning_rate = 0.1, iterations = 30.
XGB (XGBoost): max_depth = 4, learning_rate = 0.1.
LGB (LightGBM): max_depth = 4, learning_rate = 0.1.
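As a sketch of this setup, the Scikit-learn classifiers above can be compared with fivefold cross-validation as follows. The synthetic feature matrix is an assumption standing in for the hand-engineered visit features, and the three boosting libraries are shown only in comments because their availability is environment-dependent:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the hand-engineered visit features (assumption).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

classifiers = {
    "LR": LogisticRegression(max_iter=1000),     # default settings
    "DT": DecisionTreeClassifier(max_depth=4),
    "RF": RandomForestClassifier(n_estimators=10),
    "AB": AdaBoostClassifier(),                  # default settings
    "GB": GradientBoostingClassifier(max_depth=4),
    # The boosting libraries would be configured analogously, e.g.:
    #   catboost.CatBoostClassifier(depth=4, learning_rate=0.1, iterations=30)
    #   xgboost.XGBClassifier(max_depth=4, learning_rate=0.1)
    #   lightgbm.LGBMClassifier(max_depth=4, learning_rate=0.1)
}

# Fivefold cross-validation, as in the experiments (repeated 25 times there).
results = {name: cross_val_score(clf, X, y, cv=5).mean()
           for name, clf in classifiers.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```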
Figure 13 summarizes the comparison results for the eight classifiers in terms of prediction accuracy and running time. To obtain stable results, we repeated fivefold cross-validation 25 times and report the averages aggregated over the seven stores. LGB turned out to be the fastest among the three best-performing classifiers (GB, XGB, and LGB). CAB was also very fast and gave comparable results. Interestingly, DT took more time than RF and achieved a better result under the default settings. Table 8 presents the details of Fig. 13 by showing the accuracy for each of the seven stores. The mean and standard deviation were calculated from the average accuracies of the 25 fivefold cross-validations.
B. Comparison on stacking models
To achieve additional performance improvement, we applied stacking (meta ensembling) with eight strategies. Stacking is a model ensembling technique that combines multiple predictive models to generate a better one [38]. The stacked model usually outperforms each of the individual models owing to its smoothing nature and its ability to highlight each base model where it performs best. The main idea of stacking is to use the prediction results of the base models as features for a stacking model in the second layer.
To do this, we selected CAB, XGB, and LGB as the base models. We further separated the training set into three subsets and used two subsets to generate prediction labels for the remaining subset. The prediction labels for the testing set were also computed three (\(=_3\!C_2\)) times, and the three sets of labels for the testing set were averaged for the final use. In this way, we generated label features for both the training and testing sets. These additional features were fed to the final LGB stacking model. We followed the general procedure from the referenceFootnote 12 and added three options. Figure 14 illustrates the process of creating eight stacking models (\(M_1\)–\(M_8\)) through the choice of the three options, described as follows.
Sampling strategy: A parameter that determines whether to use random oversampling [15] or downsampling. This option is not directly related to stacking, but we added it to improve accuracy by treating the class imbalance problem.
# of predictions: A parameter that determines whether to use one model or multiple models for each fold. The former case generates a single additional feature, and the latter case generates three additional features.
Using only labels: A parameter that determines whether to use only the prediction labels (one or three features) or all existing features together with the prediction labels (n+1 or n+3 features, where n is the total number of hand-engineered features).
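The out-of-fold procedure above can be sketched as follows. As assumptions, Scikit-learn models stand in for CAB/XGB/LGB to keep the sketch self-contained, the data is synthetic, and the variant shown is the one that concatenates the original features with the three label features (n + 3):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, train_test_split

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Stand-ins for the three base models (CAB, XGB, LGB in the paper).
base_models = [GradientBoostingClassifier(max_depth=4),
               RandomForestClassifier(n_estimators=50, random_state=0),
               LogisticRegression(max_iter=1000)]

kf = KFold(n_splits=3, shuffle=True, random_state=0)
train_meta = np.zeros((len(X_tr), len(base_models)))
test_meta = np.zeros((len(X_te), len(base_models)))

for j, model in enumerate(base_models):
    test_fold_preds = []
    for tr_idx, ho_idx in kf.split(X_tr):
        # Fit on two subsets, predict labels for the held-out subset.
        model.fit(X_tr[tr_idx], y_tr[tr_idx])
        train_meta[ho_idx, j] = model.predict_proba(X_tr[ho_idx])[:, 1]
        test_fold_preds.append(model.predict_proba(X_te)[:, 1])
    # Average the three sets of testing-set labels for the final use.
    test_meta[:, j] = np.mean(test_fold_preds, axis=0)

# Second-layer model on the original features plus the label features.
stacker = GradientBoostingClassifier(max_depth=4)
stacker.fit(np.hstack([X_tr, train_meta]), y_tr)
acc = accuracy_score(y_te, stacker.predict(np.hstack([X_te, test_meta])))
print(f"stacked accuracy: {acc:.3f}")
```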
Table 9 shows the average accuracy obtained for each of the seven stores in detail.Footnote 13 We observed that the performance improvement was not substantial despite the long running time of the stacking model. Thus, we conjecture that each of the best-performing classifiers achieves nearly the highest attainable accuracy by itself.
C. Lower bounds of prediction accuracy
The visit logs \(v_k\) with the same visit count k are considered to have the same information. Therefore, to maximize the accuracy, we must predict the label \(l\) of \(v_k\) by the following criterion: predict \(l = 1\) (revisit) if \(E[RV_{\mathrm{bin}}(v_k)] \ge 1/2\), and \(l = 0\) otherwise.
Considering each proportion \(p_k = |v_k|/\sum _{k}{|v_k|}\) and abbreviating \(E[RV_{\mathrm{bin}}(v_k)]\) as \(r_k\), the lower-bound accuracy of a model can be written as \(LB = \sum _{k}p_k \cdot \max (r_k, 1-r_k)\). In the experiment with only first-time visitors, \(LB = 1/2\) since \(p_1 = 1\) and \(r_1 = 1/2\).
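The lower bound can be computed directly from visit logs; the sketch below assumes logs are given as (visit count, revisit label) pairs:

```python
def lower_bound_accuracy(visits):
    """visits: list of (visit count k, revisit label in {0, 1}) pairs.
    Computes LB = sum_k p_k * max(r_k, 1 - r_k), where p_k is the share
    of logs with visit count k and r_k is their empirical revisit rate."""
    n = len(visits)
    by_k = {}
    for k, label in visits:
        by_k.setdefault(k, []).append(label)
    lb = 0.0
    for labels in by_k.values():
        p_k = len(labels) / n
        r_k = sum(labels) / len(labels)
        lb += p_k * max(r_k, 1 - r_k)
    return lb

# First-timers split 50/50 contribute 0.5 * 0.5; second-timers all revisit.
logs = [(1, 0), (1, 1), (2, 1), (2, 1)]
print(lower_bound_accuracy(logs))  # → 0.75
```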
The interpretation of the lower bound is as follows. For higher predictability, the revisit tendency within each \(v_k\) should be homogeneous. In Fig. 15, we can see that store L_MD is more predictable than A_GN, because \(|r_k-0.5|\) of L_MD is larger than that of A_GN for the majority of k.
D. Assumptions to interpret the data
Here, we clarify how we count the first-time visitors and explain several underlying assumptions.
Assumption 1: Because we do not know whether customers visited a store before the data was collected, we simply assume that they did not visit before the collection period. We believe this assumption is reasonable because the stores from which we collected the data were relatively new at the time we began data collection.
Assumption 2: Because customers are captured only when they turn on the Wi-Fi of their mobile device, we assume that each customer’s Wi-Fi turn-on behavior is consistent across their visits to the store. We also assume that there is no correlation between Wi-Fi usage and customer group (first-time visitors vs. VIP customers).
Assumption 3: We assume that customers visit the store with a device having the same MAC address. For this reason, we retained only Android devices and removed Apple devices in the preprocessing step, because iOS versions later than 8.0 follow a MAC-address randomization policy [21], which makes it infeasible to identify the same customer.
Rigorously speaking, the proportion of true first-time visitors would be less than 70% when all the effects explained above are considered. Nevertheless, these customers are still likely to be early-stage visitors.
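Our preprocessing removed Apple devices by vendor. As a more general illustration (an assumption on our part, not the exact rule used in the paper), randomized MAC addresses can be recognized by the locally administered bit of the first octet, which randomization schemes set:

```python
def is_locally_administered(mac: str) -> bool:
    """Randomized MAC addresses set the locally administered bit
    (bit 0x02 of the first octet), so such a device cannot be
    re-identified across visits by its address alone."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)

print(is_locally_administered("02:00:5e:10:00:01"))  # True (randomized-style)
print(is_locally_administered("a4:5e:60:10:00:01"))  # False (globally unique)
```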
E. Deciding the group movement threshold
We set the group movement threshold to 30 s based on the following reasoning. According to our observation at store E_GN on the afternoons of June 24 and June 26, 2017, 56% of 105 customers entered the store with companions, which was more than half. Taking \(p_x=39.2\%\) as the on-site Wi-Fi turn-on rate (always-on: 29.2%, conditionally-on: 10%) [24] and \(p_y = 56\%\) as the actual proportion of customers in a group, we expected that \(p_{yo}=15.5\%\) of the total visitors would be represented as having companions in our collected data of store E_GN (by Eq. 1 in Sect. 5.3.2). With 30 s as the threshold of accompaniment, 15% of the total visitors in the same data were indeed identified as having companions. Given the small gap between the expected and observed group ratios, we claim that 30 s is an appropriate threshold for distinguishing group movement.
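As a minimal illustration of applying such a threshold (a simplified setting we assume here, where each visitor contributes a single entrance timestamp), consecutive entrances within 30 s of each other can be treated as group movement:

```python
def companion_ratio(entry_times, threshold=30):
    """Fraction of visitors whose entrance falls within `threshold`
    seconds of another visitor's entrance (treated as companions)."""
    times = sorted(entry_times)
    grouped = set()
    for i in range(len(times) - 1):
        if times[i + 1] - times[i] <= threshold:
            grouped.add(i)
            grouped.add(i + 1)
    return len(grouped) / len(times)

# Entrances at 0 s and 10 s pair up, as do 500 s and 520 s: 4 of 5 visitors.
print(companion_ratio([0, 10, 200, 500, 520]))  # → 0.8
```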
Kim, S., Lee, JG. A systematic framework of predicting customer revisit with in-store sensors. Knowl Inf Syst 62, 1005–1035 (2020). https://doi.org/10.1007/s10115-019-01373-y