Definition
Event log cleaning is a data preparation phase that turns event data into event logs to enable or improve the quality of business process analytics methods like process mining, model enrichment, and conformance checking. Event data might have to be collected from different sources and formats, filtered, transformed, and assigned to the corresponding processes and cases.
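The steps named in the definition (collect, filter, transform, assign to cases) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the entry: the two sources (`crm_rows`, `erp_rows`), their field layouts and timestamp formats, the `heartbeat` noise event, and the helper names are all hypothetical.

```python
from datetime import datetime

# Hypothetical raw event data from two sources with different formats.
crm_rows = [
    {"order": "A1", "action": "Create Order", "time": "2018-01-05 09:00:00"},
    {"order": "A2", "action": "Create Order", "time": "2018-01-05 09:30:00"},
    {"order": "A1", "action": "heartbeat", "time": "2018-01-05 09:01:00"},
]
erp_rows = [
    ("A1", "ship order", "05/01/2018 14:15"),
    ("A2", "ship order", "06/01/2018 08:45"),
]


def normalize_crm(row):
    """Transform a CRM record (dict, ISO-like timestamp) into a common event."""
    return {
        "case_id": row["order"],
        "activity": row["action"].strip().title(),
        "timestamp": datetime.strptime(row["time"], "%Y-%m-%d %H:%M:%S"),
    }


def normalize_erp(row):
    """Transform an ERP record (tuple, day-first timestamp) into a common event."""
    case_id, action, time = row
    return {
        "case_id": case_id,
        "activity": action.strip().title(),
        "timestamp": datetime.strptime(time, "%d/%m/%Y %H:%M"),
    }


def build_event_log(crm, erp, irrelevant=("Heartbeat",)):
    # Collect: merge both sources into one list of normalized events.
    events = [normalize_crm(r) for r in crm] + [normalize_erp(r) for r in erp]
    # Filter: drop events that are assumed not to belong to the process.
    events = [e for e in events if e["activity"] not in irrelevant]
    # Assign: group events by case and keep each trace ordered by time.
    log = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        log.setdefault(event["case_id"], []).append(event)
    return log


event_log = build_event_log(crm_rows, erp_rows)
```

Here `event_log` maps each case identifier to its time-ordered trace. Real projects typically need more elaborate correlation than a shared order number, so the grouping step should be read as the simplest possible case.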
Overview
The goal of business process analytics projects is to gain insights into the execution of business processes. It helps to know in advance which questions the analysis should answer. Typical questions are what is done (activities), when it is done or how long it takes (timestamps), in which order (relations), and by whom (resources). In contrast to traditional questionnaires, the process participants do not need to be asked personally about their perception of the process. In business process analytics, the event logs containing process...
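The four typical questions can be made concrete with a small sketch over a toy event log. The log contents, the tuple layout `(activity, timestamp, resource)`, and the function name `summarize` are assumptions made for illustration. Counting directly-follows pairs is used here as one common way to express order in process mining; the entry itself does not prescribe it.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event log: one trace (list of events) per case.
# Each event is (activity, timestamp, resource).
log = {
    "case-1": [
        ("Register", datetime(2018, 1, 5, 9, 0), "Alice"),
        ("Check", datetime(2018, 1, 5, 10, 30), "Bob"),
        ("Approve", datetime(2018, 1, 5, 12, 0), "Alice"),
    ],
    "case-2": [
        ("Register", datetime(2018, 1, 6, 9, 0), "Alice"),
        ("Approve", datetime(2018, 1, 6, 9, 45), "Carol"),
    ],
}


def summarize(log):
    activities = Counter()        # what is done
    resources = Counter()         # by whom
    directly_follows = Counter()  # in which order
    durations = {}                # how long each case takes
    for case_id, trace in log.items():
        trace = sorted(trace, key=lambda event: event[1])
        for activity, _, resource in trace:
            activities[activity] += 1
            resources[resource] += 1
        # Count each pair of consecutive activities within the same case.
        for (first, _, _), (second, _, _) in zip(trace, trace[1:]):
            directly_follows[(first, second)] += 1
        durations[case_id] = trace[-1][1] - trace[0][1]
    return activities, durations, directly_follows, resources


activities, durations, directly_follows, resources = summarize(log)
```

All four answers depend on the log being clean: a wrong case assignment or a mis-parsed timestamp changes the counted relations and durations directly, which is why the preparation phase matters.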
Copyright information
© 2018 Springer International Publishing AG
About this entry
Cite this entry
Solti, A. (2018). Event Log Cleaning for Business Process Analytics. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_87-1
DOI: https://doi.org/10.1007/978-3-319-63962-8_87-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering