Abstract
Process Mining is attracting growing interest in many contexts where performance bottlenecks are critical for the business. Unfortunately, real cyber-physical systems are usually not designed to readily support these techniques. One of the most frequent problems is transforming the acquired data, which are often heterogeneous and unlabeled, into a form that allows Process Mining techniques to be applied. In this study, we propose an automated and unsupervised methodology for extracting CaseIDs from an unlabeled event log. The proposed detection of CaseIDs is based on the definition of appropriate heuristic metrics that highlight the correlation between events belonging to the same process instance, according to temporal and topological features (e.g., kinds of functionally related devices, topological distance, etc.). These features constitute the inputs to a clustering technique, which is used to extract the different cases. The performance of the proposed methodology was evaluated on a real diagnostic management system that supports decisions in maintenance operations on railway infrastructures. The system was reproduced and tested in Gematica’s laboratory to simulate the data used in this work.
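The idea summarised in the abstract, scoring how strongly two events are related from temporal and topological features and then clustering on that score, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Event` fields, the weighted-sum `distance` heuristic, the scale parameters, and the threshold-based single-linkage grouping are hypothetical stand-ins, not the metrics or the clustering algorithm used in the paper.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary origin (hypothetical field)
    device_pos: int   # position of the emitting device along a line (hypothetical field)
    activity: str


def distance(a, b, time_scale=60.0, hop_scale=5.0):
    """Hypothetical heuristic: equal-weight sum of scaled temporal and topological gaps."""
    dt = abs(a.timestamp - b.timestamp) / time_scale
    dh = abs(a.device_pos - b.device_pos) / hop_scale
    return 0.5 * dt + 0.5 * dh


def assign_case_ids(events, threshold=0.5):
    """Single-linkage grouping: events closer than `threshold` end up in the same case."""
    parent = list(range(len(events)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(events)), 2):
        if distance(events[i], events[j]) < threshold:
            parent[find(i)] = find(j)

    # Relabel cluster roots as consecutive CaseIDs in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(events))]


if __name__ == "__main__":
    log = [
        Event(0.0, 1, "alarm_on"),
        Event(10.0, 2, "ack"),
        Event(500.0, 20, "alarm_on"),
        Event(520.0, 21, "ack"),
    ]
    print(assign_case_ids(log))
```

In this toy log the first two events are close in both time and position, as are the last two, so the sketch assigns them to two separate cases. The threshold and the scale parameters are arbitrary here and would have to be tuned for any real log.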
Acknowledgements
The research has been supported by the DARWINIST project, funded by Università della Campania “L. Vanvitelli”, D.R. 834 30-09-2022. The work of Roberta De Fazio is supported by PON Ricerca e Innovazione 2014/2020 MUR (Ministero dell’Università e della Ricerca, Italy) through the PhD program, XXXVII cycle, D.M. N.1061 “Dottorati e contratti di ricerca su tematiche dell’Innovazione”. The work of Laura Verde is supported by the “Predictive Maintenance Multidominio (Multidomain predictive maintenance)” project, PON “Ricerca e Innovazione” 2014–2020, Asse IV-Azione IV.4 #B61B21005470007.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
De Fazio, R. et al. (2024). CaseID Detection for Process Mining: A Heuristic-Based Methodology. In: De Smedt, J., Soffer, P. (eds) Process Mining Workshops. ICPM 2023. Lecture Notes in Business Information Processing, vol 503. Springer, Cham. https://doi.org/10.1007/978-3-031-56107-8_4
DOI: https://doi.org/10.1007/978-3-031-56107-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56106-1
Online ISBN: 978-3-031-56107-8
eBook Packages: Computer Science (R0)