CaseID Detection for Process Mining: A Heuristic-Based Methodology

  • Conference paper
Process Mining Workshops (ICPM 2023)

Abstract

Process Mining is attracting growing interest in many contexts where performance bottlenecks are critical for the business. Unfortunately, real cyber-physical systems are usually not designed to support these techniques. One of the most frequent problems is transforming the acquired data, which are often heterogeneous and unlabeled, so that Process Mining techniques can be applied. In this study, we propose an automated and unsupervised methodology for extracting CaseIDs from an unlabeled event log. The proposed detection of CaseIDs is based on the definition of appropriate heuristic metrics that highlight the correlation between events belonging to the same process instance, according to temporal and topological features (e.g., kinds of functionally related devices, topological distance, etc.). These features constitute the inputs for a clustering technique, which is used to extract the different cases. The performance of the proposed methodology was evaluated on a real diagnostic management system that supports decisions in maintenance operations for railway infrastructures. The system was reproduced and tested in Gematica’s laboratory to simulate the data used in this work.
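To make the idea concrete, the following minimal sketch shows how temporal and topological features of unlabeled events could feed a clustering step whose cluster labels act as CaseIDs. It is an illustration under stated assumptions, not the authors' implementation: the `Event` fields, the scaling constants, the toy log, and the small hand-written k-means with a deterministic initialisation are all hypothetical. The notes of the paper point to scikit-learn's `KMeans`, which could be used in place of the hand-written loop.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since the start of the log (assumed unit)
    position: float   # hypothetical topological coordinate, e.g. km along the line
    device: str       # device that emitted the event (not used as a feature here)


def features(e: Event, time_scale: float = 60.0, dist_scale: float = 1.0):
    """Map an event to a (temporal, topological) feature vector.

    The scales are illustrative assumptions; they control how much each
    feature contributes to the Euclidean distance used for clustering.
    """
    return (e.timestamp / time_scale, e.position / dist_scale)


def kmeans(points, k, max_iter=50):
    """Plain Lloyd-style k-means with a deterministic, evenly spaced init."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


# Toy log: two process instances interleaved in time but far apart topologically.
events = [
    Event(0.0, 1.0, "sensor-a"),
    Event(5.0, 8.0, "sensor-x"),
    Event(10.0, 1.1, "sensor-b"),
    Event(15.0, 8.1, "sensor-y"),
    Event(20.0, 1.2, "sensor-c"),
    Event(25.0, 8.2, "sensor-z"),
]

# Each cluster label plays the role of a candidate CaseID.
case_ids = kmeans([features(e) for e in events], k=2)
```

In this toy log the events alternate between two distant locations, so the topological feature dominates and the alternating events are separated into two cases. The number of clusters `k` is fixed by hand here; in practice it would have to be estimated.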

Notes

  1. https://gematica.com/

  2. https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html


Acknowledgements

The research has been supported by the DARWINIST project, funded by Università della Campania “L. Vanvitelli”, D.R. 834 30-09-2022. The work of Roberta De Fazio is funded by PON Ricerca e Innovazione 2014/2020 MUR (Ministero dell’Università e della Ricerca, Italy) within the PhD program XXXVII cycle, D.M. N.1061 “Dottorati e contratti di ricerca su tematiche dell’Innovazione”. The work of Laura Verde is funded by the “Predictive Maintenance Multidominio (Multidomain predictive maintenance)” project, PON “Ricerca e Innovazione” 2014–2020, Asse IV-Azione IV.4 #B61B21005470007.

Author information

Corresponding authors

Correspondence to Roberta De Fazio or Laura Verde.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

De Fazio, R. et al. (2024). CaseID Detection for Process Mining: A Heuristic-Based Methodology. In: De Smedt, J., Soffer, P. (eds) Process Mining Workshops. ICPM 2023. Lecture Notes in Business Information Processing, vol 503. Springer, Cham. https://doi.org/10.1007/978-3-031-56107-8_4

  • DOI: https://doi.org/10.1007/978-3-031-56107-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56106-1

  • Online ISBN: 978-3-031-56107-8

  • eBook Packages: Computer Science, Computer Science (R0)
