CaseID Detection for Process Mining: A Heuristic-Based Methodology

De Fazio, Roberta; Balzanella, Antonio; Marrone, Stefano; Marulli, Fiammetta; Verde, Laura; Reccia, Vincenzo; Valletta, Paolo

doi:10.1007/978-3-031-56107-8_4

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 503)

Included in the following conference series:

International Conference on Process Mining

Abstract

Process Mining is attracting growing interest in many contexts where performance bottlenecks are critical for the business. Unfortunately, real cyber-physical systems are usually not designed to make these techniques easy to apply. One of the most frequent problems is transforming the acquired data, which are often heterogeneous and unlabeled, into a form that allows Process Mining techniques to be applied. In this study, we propose an automated, unsupervised methodology for extracting CaseIDs from an unlabeled event log. The detection of CaseIDs is based on appropriate heuristic metrics that highlight the correlation between events belonging to the same process instance, according to temporal and topological features (e.g., kinds of functionally related devices, topological distance, etc.). These features are the inputs to a clustering technique, which is used to extract the different cases. The performance of the proposed methodology was evaluated on a real diagnostic management system that supports maintenance decisions in railway infrastructures. The system was reproduced and tested in Gematica’s laboratory to simulate the data used in this work.
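The general idea in the abstract (a heuristic distance built from temporal and topological features, followed by clustering) can be illustrated with a minimal sketch. It is not the authors' metric or clustering algorithm, which the abstract does not specify. The `Event` fields, the linear topology encoded by `node`, the weights, the scales, and the threshold are all hypothetical, and simple single-linkage threshold clustering stands in for whatever technique the paper actually uses.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary origin (hypothetical field)
    device: str       # emitting device (hypothetical field)
    node: int         # device position on a linear topology (hypothetical field)


def event_distance(a, b, time_scale=60.0, hop_scale=5.0, w_time=0.5, w_topo=0.5):
    """Weighted sum of a normalised time gap and a normalised hop distance.

    All scales and weights are illustrative assumptions, not values from the paper.
    """
    dt = abs(a.timestamp - b.timestamp) / time_scale
    dh = abs(a.node - b.node) / hop_scale
    return w_time * dt + w_topo * dh


def assign_case_ids(events, threshold=0.5):
    """Link events closer than `threshold` and label each connected group."""
    parent = list(range(len(events)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(events)), 2):
        if event_distance(events[i], events[j]) < threshold:
            parent[find(j)] = find(i)

    # Relabel group roots as consecutive case IDs in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(events))]


log = [
    Event(0.0, "sensor_A", 1),
    Event(20.0, "sensor_B", 2),
    Event(35.0, "switch_C", 3),
    Event(900.0, "sensor_X", 40),
    Event(915.0, "sensor_Y", 41),
]
case_ids = assign_case_ids(log)
print(case_ids)  # [0, 0, 0, 1, 1]
```

In this toy log the first three events are close in both time and topology and end up in one case, while the last two form a second case. A real event log would need domain-specific features and a principled way to choose the threshold or the number of clusters.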

Acknowledgements

This research was supported by the DARWINIST project, funded by Università della Campania “L. Vanvitelli”, D.R. 834 30-09-2022. The work of Roberta De Fazio is funded by PON Ricerca e Innovazione 2014/2020, MUR (Ministero dell’Università e della Ricerca, Italy), under the PhD program XXXVII cycle, D.M. N.1061 “Dottorati e contratti di ricerca su tematiche dell’Innovazione”. The work of Laura Verde is funded by the “Predictive Maintenance Multidominio (Multidomain predictive maintenance)” project, PON “Ricerca e Innovazione” 2014–2020, Asse IV-Azione IV.4 #B61B21005470007.

Author information

Authors and Affiliations

Dipartimento di Matematica e Fisica, Università della Campania “Luigi Vanvitelli”, viale Lincoln, 7, Caserta, Italy
Roberta De Fazio, Antonio Balzanella, Stefano Marrone, Fiammetta Marulli & Laura Verde
Gematica srl, via Diocleziano 107, Naples, Italy
Vincenzo Reccia & Paolo Valletta

Corresponding authors

Correspondence to Roberta De Fazio or Laura Verde.

Editor information

Editors and Affiliations

KU Leuven, Leuven, Belgium
Johannes De Smedt
University of Haifa, Haifa, Israel
Pnina Soffer

About this paper

Cite this paper

De Fazio, R. et al. (2024). CaseID Detection for Process Mining: A Heuristic-Based Methodology. In: De Smedt, J., Soffer, P. (eds) Process Mining Workshops. ICPM 2023. Lecture Notes in Business Information Processing, vol 503. Springer, Cham. https://doi.org/10.1007/978-3-031-56107-8_4

DOI: https://doi.org/10.1007/978-3-031-56107-8_4
Published: 13 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56106-1
Online ISBN: 978-3-031-56107-8
eBook Packages: Computer Science, Computer Science (R0)
