Temporal Pattern Mining for E-commerce Dataset

Transactions on Large-Scale Data- and Knowledge-Centered Systems XLVI

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 12410)

Abstract

Over the last few years, several data mining algorithms have been developed to understand customers’ behavior on e-commerce platforms. They aim to extract knowledge and predict future actions on the website. In this paper, we present three algorithms for mining frequent sequential patterns: SEPM−, SEPM+ and SEPM++ (Sequential Event Pattern Mining). Our goal is to mine clickstream data to extract and analyze useful sequential patterns of clicks. For this purpose, we augment the vertical representation of patterns with additional information about the items’ durations. Based on this representation, we propose algorithms that mine frequent sequential patterns together with the average duration of each of their items. The algorithms also take into account the direction in which durations vary along the sequence. The duration is used as a proxy for the user’s interest in the content of the page. Finally, we categorize the resulting patterns and prove that they are more discriminating than the standard ones. Our approach is tested on real data, and the patterns found are analyzed to extract users’ discriminatory behaviors. The experimental results on both real and synthetic datasets indicate that our algorithms are efficient and scalable.
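To make the idea of a duration-augmented vertical representation concrete, here is a minimal Python sketch. It is not the SEPM−, SEPM+ or SEPM++ algorithms, whose details are not given in this abstract. It is only an illustration under stated assumptions: the toy `sessions` data, the function names (`build_vertical`, `match_pattern`, `summarize`), the choice of the earliest occurrence of a pattern in each session, and the "up"/"down"/"flat" labels are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy clickstream: session id -> ordered (page, seconds spent) pairs.
sessions = {
    "s1": [("home", 5), ("product", 40), ("cart", 20)],
    "s2": [("home", 3), ("product", 60)],
    "s3": [("home", 8), ("search", 12), ("product", 20), ("cart", 10)],
}

def build_vertical(sessions):
    """Vertical layout: item -> {session id: [(position, duration), ...]}."""
    vertical = defaultdict(lambda: defaultdict(list))
    for sid, clicks in sessions.items():
        for pos, (item, duration) in enumerate(clicks):
            vertical[item][sid].append((pos, duration))
    return vertical

def match_pattern(vertical, pattern):
    """Return {session id: durations} for sessions containing `pattern`
    as a subsequence, using the earliest occurrence (an assumption)."""
    matches = {}
    for sid in vertical.get(pattern[0], {}):
        pos, durations = -1, []
        for item in pattern:
            occurrences = vertical.get(item, {}).get(sid, [])
            # First occurrence of `item` strictly after the previous match.
            hit = next(((p, d) for p, d in occurrences if p > pos), None)
            if hit is None:
                break
            pos = hit[0]
            durations.append(hit[1])
        else:
            matches[sid] = durations
    return matches

def summarize(vertical, pattern, n_sessions):
    """Support, per-item average duration, and direction of variation."""
    matches = match_pattern(vertical, pattern)
    support = len(matches) / n_sessions
    if not matches:
        return support, [], []
    avg = [sum(col) / len(col) for col in zip(*matches.values())]
    trend = ["up" if b > a else "down" if b < a else "flat"
             for a, b in zip(avg, avg[1:])]
    return support, avg, trend

vertical = build_vertical(sessions)
print(summarize(vertical, ("home", "product", "cart"), len(sessions)))
```

In this toy data, the pattern home, product, cart appears in two of the three sessions; the average time rises from the home page to the product page and falls again at the cart, which is the kind of duration trend the abstract describes as a proxy for user interest.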

Notes

  1. These mining techniques are not addressed in our work because of the lack of public data containing this information.

  2. Any other classic sequential pattern mining algorithm (such as SPADE or SPAM) can be used, since they all produce the same output patterns.

Author information

Corresponding author

Correspondence to Mohamad Kanaan.

A Appendix

[Figures c to h: images not included in this extract]

Copyright information

© 2020 Springer-Verlag GmbH Germany, part of Springer Nature

About this chapter

Cite this chapter

Kanaan, M., Cazabet, R., Kheddouci, H. (2020). Temporal Pattern Mining for E-commerce Dataset. In: Hameurlain, A., Tjoa, A.M. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XLVI. Lecture Notes in Computer Science, vol 12410. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-62386-2_3

  • DOI: https://doi.org/10.1007/978-3-662-62386-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-62385-5

  • Online ISBN: 978-3-662-62386-2

  • eBook Packages: Computer Science (R0)
