A Generic Framework for Attribute-Driven Hierarchical Trace Clustering

van Zelst, Sebastiaan J.; Cao, Yukun

doi:10.1007/978-3-030-66498-5_23

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 397)

Included in the following conference series:

International Conference on Business Process Management

Abstract

The execution of a business process often takes place within a specific execution context, e.g., a customer, service, or product. The corresponding event data often records indicators of this context, e.g., a customer type (bronze, silver, gold, or platinum). Typically, the execution of a process varies across its different execution contexts. To gain a better understanding of the global process execution, it is interesting to study the behavioral (dis)similarity between the different execution contexts of a process. However, in real business settings, the number of execution contexts might be too large to analyze manually. At the same time, current trace clustering techniques do not take process type information into account, i.e., they are solely behaviorally driven. Hence, in this paper, we present a hierarchical data-attribute-driven trace clustering framework that allows us to compare the behavior of different groups of traces. Our evaluation shows that the incorporation of data attributes in trace clustering yields interesting novel process insights.
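
To make the idea of attribute-driven hierarchical clustering concrete, the following is a minimal sketch and not the authors' framework. It assumes a toy log in which each case is a dictionary with a hypothetical "customer_type" attribute and a "trace" (a list of activity names). As an assumed stand-in for a behavioral dissimilarity, it uses the Jaccard distance between the directly-follows relations of two groups, and it repeatedly merges the two most similar groups. All function names are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def group_by_attribute(log, attribute):
    """Group the traces of a log by the value of a case-level attribute."""
    groups = defaultdict(list)
    for case in log:
        groups[case[attribute]].append(tuple(case["trace"]))
    return dict(groups)


def directly_follows(traces):
    """Return all pairs (a, b) such that b directly follows a in some trace."""
    return {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}


def group_distance(traces_1, traces_2):
    """Jaccard distance between directly-follows relations (an assumed measure)."""
    x, y = directly_follows(traces_1), directly_follows(traces_2)
    union = x | y
    return 1.0 - len(x & y) / len(union) if union else 0.0


def cluster_hierarchically(groups):
    """Agglomeratively merge attribute groups; return the merge history."""
    clusters = {frozenset([value]): list(traces) for value, traces in groups.items()}
    merges = []
    while len(clusters) > 1:
        # Sort the keys so that ties are broken deterministically.
        keys = sorted(clusters, key=sorted)
        a, b = min(
            combinations(keys, 2),
            key=lambda pair: group_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((set(a), set(b), group_distance(clusters[a], clusters[b])))
        # The merged cluster contains the traces of both merged groups.
        clusters[a | b] = clusters.pop(a) + clusters.pop(b)
    return merges


# Hypothetical toy log with four customer types.
example_log = [
    {"customer_type": "bronze", "trace": ["a", "b", "c"]},
    {"customer_type": "bronze", "trace": ["a", "b", "c"]},
    {"customer_type": "silver", "trace": ["a", "b", "c", "d"]},
    {"customer_type": "gold", "trace": ["a", "x", "y", "d"]},
    {"customer_type": "platinum", "trace": ["a", "x", "y", "d"]},
    {"customer_type": "platinum", "trace": ["a", "x", "d"]},
]

if __name__ == "__main__":
    groups = group_by_attribute(example_log, "customer_type")
    for left, right, distance in cluster_hierarchically(groups):
        print(sorted(left), sorted(right), round(distance, 3))
```

In this toy log, gold and platinum are merged first, then bronze and silver, and the two resulting clusters share no directly-follows pair. The merge history can be read as a dendrogram over the attribute values. Both the distance measure and the merge strategy are placeholders that can be swapped for other choices.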

Notes

1. https://github.com/caoyukun0430/pm4py-source/tree/yukun_paper and https://github.com/caoyukun0430/pm4py-ws/tree/dev-yukun.
2. We formalize the abstractions on sets of sequences over \(\mathcal{A}\), rather than elements \(c \in \mathcal{C}\). Note that, given \(c \in \mathcal{C}\), we are able to access the trace-view by means of \(\pi_{\texttt{trace}}(c)\).

Author information

Authors and Affiliations

Fraunhofer Institute for Applied Information Technology (FIT), Sankt Augustin, Germany
Sebastiaan J. van Zelst
Chair of Process and Data Science, RWTH Aachen University, Aachen, Germany
Sebastiaan J. van Zelst & Yukun Cao

Corresponding author

Correspondence to Sebastiaan J. van Zelst.

Editor information

Editors and Affiliations

Universidad de Sevilla, Seville, Spain
Adela Del Río Ortega
Kühne Logistics University, Hamburg, Germany
Henrik Leopold
Universidade do Estado do Rio de Janeiro – UERJ, Rio de Janeiro, Brazil
Flávia Maria Santoro

About this paper

Cite this paper

van Zelst, S.J., Cao, Y. (2020). A Generic Framework for Attribute-Driven Hierarchical Trace Clustering. In: Del Río Ortega, A., Leopold, H., Santoro, F.M. (eds) Business Process Management Workshops. BPM 2020. Lecture Notes in Business Information Processing, vol 397. Springer, Cham. https://doi.org/10.1007/978-3-030-66498-5_23

DOI: https://doi.org/10.1007/978-3-030-66498-5_23
Published: 19 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66497-8
Online ISBN: 978-3-030-66498-5
eBook Packages: Computer Science (R0)
