Event Log Cleaning for Business Process Analytics

  • Living reference work entry
Encyclopedia of Big Data Technologies

Synonyms

Event log preprocessing; Event log repair

Definition

Event log cleaning is a data preparation phase that turns event data into event logs to enable or improve the quality of business process analytics methods like process mining, model enrichment, and conformance checking. Event data might have to be collected from different sources and formats, filtered, transformed, and assigned to the corresponding processes and cases.
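The steps named in this definition (filtering, transforming, and assigning events to cases) can be illustrated with a minimal sketch. It is not taken from the entry: the record fields (`order`, `task`, `ts`, `user`), the sample data, and the function name `build_event_log` are hypothetical, and it assumes that every relevant event already carries a case identifier.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw event records; the field names are assumptions for illustration.
raw_events = [
    {"order": "A1", "task": "Create order", "ts": "2018-01-05 09:00", "user": "alice"},
    {"order": "A1", "task": "Ship order", "ts": "2018-01-06 14:30", "user": "bob"},
    {"order": "A2", "task": "Create order", "ts": "2018-01-05 10:15", "user": "alice"},
    {"order": None, "task": "Heartbeat", "ts": "2018-01-05 10:16", "user": "system"},
    {"order": "A1", "task": "Check stock", "ts": "2018-01-05 11:45", "user": "carol"},
]


def build_event_log(records):
    """Turn raw records into an event log: a mapping from case id to an ordered trace."""
    log = defaultdict(list)
    for record in records:
        # Filter: drop events that cannot be assigned to any case.
        if record["order"] is None:
            continue
        # Transform: map source-specific fields to activity, timestamp, resource.
        event = {
            "activity": record["task"],
            "timestamp": datetime.strptime(record["ts"], "%Y-%m-%d %H:%M"),
            "resource": record["user"],
        }
        # Assign: group the event under its case.
        log[record["order"]].append(event)
    # Order the events of each case by time to obtain a trace.
    for trace in log.values():
        trace.sort(key=lambda e: e["timestamp"])
    return dict(log)


event_log = build_event_log(raw_events)
for case_id, trace in event_log.items():
    print(case_id, [e["activity"] for e in trace])
```

In real projects the case identifier is often missing or ambiguous, so the grouping step would be replaced by an event correlation technique rather than a simple dictionary lookup.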

Overview

The goal of business process analytics projects is to gain insights into the execution of business processes. It helps to know in advance which questions the analysis should answer. Typical questions are what is done (activities), when it is done or how long it takes (timestamps), in which order (relations), and by whom (resources). In contrast to traditional questionnaires, the process participants do not need to be personally asked about their perception of the process. In business process analytics, the event logs containing process...

Correspondence to Andreas Solti.

© 2018 Springer International Publishing AG

About this entry

Cite this entry

Solti, A. (2018). Event Log Cleaning for Business Process Analytics. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_87-1

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_87-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering
