Abstract
To enable process analysis based on an event log without compromising the privacy of individuals involved in process execution, a log may be anonymized. Such anonymization strives to transform a log so that it satisfies provable privacy guarantees, while largely maintaining its utility for process analysis. Existing techniques perform anonymization using simple, syntactic measures to identify suitable transformation operations. As a result, the semantics of the activities referenced by the events in a trace are neglected, potentially leading to transformations in which events of unrelated activities are merged. To avoid this and account for the semantics of activities during anonymization, we propose to use a distance measure based on feature learning instead. Specifically, we show how embeddings of events enable the definition of a distance measure for traces to guide event log anonymization. Our experiments with real-world data indicate that anonymization using this measure, compared to a syntactic one, yields logs that are closer to the original log in various dimensions and, hence, have higher utility for process analysis.
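The contrast between a syntactic and an embedding-based trace distance can be illustrated with a minimal sketch. It is not the measure defined in the chapter: it assumes a weighted edit distance over traces in which the substitution cost is half the cosine distance between activity embeddings. The activity names, the two-dimensional vectors in `EMBEDDINGS`, and the function names are all hypothetical and chosen for illustration only; real embeddings would be learned from the log.

```python
import math

# Hypothetical activity embeddings (illustrative values, not learned from data).
EMBEDDINGS = {
    "register": (1.0, 0.0),
    "triage": (0.9, 0.1),
    "lab test": (0.0, 1.0),
    "release": (-1.0, 0.0),
}


def syntactic_cost(a, b):
    """Syntactic substitution cost: 0 if the labels match, 1 otherwise."""
    return 0.0 if a == b else 1.0


def embedding_cost(a, b):
    """Assumed semantic substitution cost: half the cosine distance, in [0, 1]."""
    if a == b:
        return 0.0
    u, v = EMBEDDINGS[a], EMBEDDINGS[b]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return (1.0 - dot / norm) / 2.0


def trace_distance(t1, t2, cost):
    """Weighted edit distance between two traces (lists of activity labels).

    Insertions and deletions cost 1; substitutions cost `cost(a, b)`.
    """
    rows, cols = len(t1) + 1, len(t2) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = float(i)
    for j in range(1, cols):
        d[0][j] = float(j)
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # delete t1[i-1]
                d[i][j - 1] + 1.0,  # insert t2[j-1]
                d[i - 1][j - 1] + cost(t1[i - 1], t2[j - 1]),  # substitute
            )
    return d[-1][-1]


base = ["register", "lab test"]
near = ["triage", "lab test"]    # differs by a semantically similar activity
far = ["release", "lab test"]    # differs by a semantically unrelated activity

# The syntactic cost cannot tell the two candidates apart.
same_syntactic = trace_distance(base, near, syntactic_cost) == trace_distance(
    base, far, syntactic_cost
)
# The embedding-based cost ranks the similar trace as closer.
near_is_closer = trace_distance(base, near, embedding_cost) < trace_distance(
    base, far, embedding_cost
)
```

Under these assumptions, both candidate traces are at syntactic distance 1 from the base trace, whereas the embedding-based cost places the trace containing the related activity much closer. An anonymization technique guided by such a measure would therefore prefer merging the base trace with the semantically similar one.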
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Rösel, F., Fahrenkrog-Petersen, S.A., van der Aa, H., Weidlich, M. (2022). A Distance Measure for Privacy-Preserving Process Mining Based on Feature Learning. In: Marrella, A., Weber, B. (eds) Business Process Management Workshops. BPM 2021. Lecture Notes in Business Information Processing, vol 436. Springer, Cham. https://doi.org/10.1007/978-3-030-94343-1_6
DOI: https://doi.org/10.1007/978-3-030-94343-1_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94342-4
Online ISBN: 978-3-030-94343-1
eBook Packages: Computer Science, Computer Science (R0)