On Inferring a Meaningful Similarity Metric for Customer Behaviour

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track (ECML PKDD 2021)

Abstract

In omnichannel customer service environments, where no real process is enforced, a wide variety of customer journey variants exists. This variety makes it hard to find process improvement opportunities. Modeling the journeys as traces is an essential step before discovering an explainable model of the various behaviours. Trace clustering supports improvement efforts by separating the journeys into subsets that are homogeneous in terms of behaviour and purpose. So far, the literature has used a one-size-fits-all distance metric for this task. This paper shows that a domain-informed similarity metric improves customer journey clustering compared to a generic one. We propose the SIMPRIM framework, which uses clustering quality metrics to develop a similarity metric that maximizes the separability of the journeys in a low-dimensional space while agreeing with existing process knowledge. Experimental evaluation on real-life use cases of a large telecom company and on a benchmark dataset shows that, compared to a generic metric, the inferred metric improves the internal clustering quality by 46% and 39%, respectively, while keeping the external clustering quality equal. We also show that the inferred metric can be useful for prediction applications.
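The general idea of tuning a similarity metric against a clustering quality measure can be illustrated with a small sketch. This is not the SIMPRIM implementation; it only assumes that each journey is summarized as a numeric feature vector, that existing process knowledge is available as a group label per journey, and that the metric is a feature-weighted Euclidean distance. The function names, the weight grid, and the toy data are all hypothetical.

```python
import math
from itertools import product


def weighted_distance(a, b, weights):
    """Feature-weighted Euclidean distance between two journey vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def silhouette(points, labels, weights):
    """Mean silhouette coefficient (an internal quality measure) under the weighted distance."""
    scores = []
    for i, p in enumerate(points):
        # Collect distances from p to the members of every group, excluding p itself.
        by_group = {}
        for j, q in enumerate(points):
            if i != j:
                by_group.setdefault(labels[j], []).append(weighted_distance(p, q, weights))
        own = by_group.pop(labels[i], None)
        if not own or not by_group:
            scores.append(0.0)  # singleton group or only one group: score is defined as 0
            continue
        a = sum(own) / len(own)                                  # cohesion
        b = min(sum(d) / len(d) for d in by_group.values())      # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def search_weights(points, labels, grid=(0.0, 0.5, 1.0)):
    """Exhaustive search over a small weight grid (assumed search strategy)."""
    best_weights, best_score = None, -math.inf
    for weights in product(grid, repeat=len(points[0])):
        if sum(weights) == 0:
            continue  # an all-zero metric cannot separate anything
        score = silhouette(points, labels, weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Toy data (hypothetical): feature 0 follows the known groups, feature 1 is noise.
journeys = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (1.0, 4.0), (1.1, 8.0), (0.9, 0.0)]
labels = [0, 0, 0, 1, 1, 1]

uniform = silhouette(journeys, labels, (1.0, 1.0))   # generic, unweighted metric
weights, score = search_weights(journeys, labels)    # domain-informed weights
print(weights, round(score, 3), round(uniform, 3))
```

On this toy data the search drops the noisy feature, and the weighted metric separates the known groups far better than the uniform one. A realistic setting would need a richer search strategy and a separate external quality check against held-out process knowledge.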

Author information

Corresponding authors

Correspondence to Sophie van den Berg or Marwan Hassani.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

van den Berg, S., Hassani, M. (2021). On Inferring a Meaningful Similarity Metric for Customer Behaviour. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12979. Springer, Cham. https://doi.org/10.1007/978-3-030-86517-7_15

  • DOI: https://doi.org/10.1007/978-3-030-86517-7_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86516-0

  • Online ISBN: 978-3-030-86517-7

  • eBook Packages: Computer Science, Computer Science (R0)
