Abstract
Over the last decade, process mining techniques have matured, and more and more organizations have started to use process mining to analyze their operational processes. The current hype around “big data” illustrates the desire to analyze ever-growing data sets. Process mining starts from event logs, that is, multisets of traces (sequences of events), and for the widespread application of process mining it is vital to be able to handle “big event logs”. Some event logs are “big” because they contain many traces; others are big because they contain many different activities. Most of the more advanced process mining algorithms (for both process discovery and conformance checking) scale very badly in the number of activities. For these algorithms, it could help to split the big event log (containing many activities) into a collection of smaller event logs (each containing fewer activities), run the algorithm on each of these smaller logs, and merge the results into a single result. This paper introduces a generic framework for doing exactly that, and makes this concrete by implementing algorithms for decomposed process discovery and decomposed conformance checking using Integer Linear Programming (ILP) based algorithms. ILP-based process mining techniques provide precise results and formal guarantees (e.g., perfect fitness), but are known to scale badly in the number of activities. A small case study shows that we can gain orders of magnitude in run-time. However, in some cases there is a tradeoff between run-time and quality.
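The splitting step described in the abstract can be illustrated with a minimal sketch. It assumes an event log is represented as a multiset of traces (here a `Counter` over tuples of activity names) and that a set of activity clusters has already been chosen. The function names `project_log` and `decompose_log`, the example log, and the clusters are all hypothetical and serve only as illustration; the sketch covers neither how the framework selects clusters nor how the per-sublog results are merged.

```python
from collections import Counter


def project_log(log, activities):
    """Project each trace onto the given activities, preserving multiplicities."""
    projected = Counter()
    for trace, count in log.items():
        # Keep only the events whose activity belongs to this cluster.
        sub_trace = tuple(a for a in trace if a in activities)
        projected[sub_trace] += count
    return projected


def decompose_log(log, clusters):
    """Return one smaller event log per activity cluster."""
    return [project_log(log, frozenset(cluster)) for cluster in clusters]


# Hypothetical event log: a multiset of traces over activities a..e.
log = Counter({
    ("a", "b", "c", "d"): 3,
    ("a", "c", "b", "d"): 2,
    ("a", "e", "d"): 1,
})

# Hypothetical, overlapping activity clusters (an assumption for this sketch).
clusters = [{"a", "b", "c"}, {"b", "c", "d"}, {"a", "e", "d"}]

sublogs = decompose_log(log, clusters)
for cluster, sublog in zip(clusters, sublogs):
    print(sorted(cluster), dict(sublog))
```

Each sublog keeps the same number of traces as the original log but mentions fewer distinct activities, which is the dimension in which the algorithms discussed here scale badly.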
Notes
1. The DivideAndConquer package is available through https://svn.win.tue.nl/trac/prom/browser/Packages/DivideAndConquer.
2. ProM 6 is available through http://www.promtools.org/prom6.
3. The event logs used for this case study can be downloaded from https://svn.win.tue.nl/trac/prom/browser/Documentation/DivideAndConquer.
4. Please note that, by changing the cost structure as suggested in [14], we can accumulate costs when merging subalignments into a single alignment. However, we do not have a way yet to accumulate fitness when merging subalignments. For this reason, we restrict ourselves to costs here.
References
1. Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Byers, A.H.: Big Data: The Next Frontier for Innovation, Competition, and Productivity. Technical report, McKinsey Global Institute (2011)
2. Hilbert, M., López, P.: The World’s Technological Capacity to Store, Communicate, and Compute Information. Sci. 332(6025), 60–65 (2011)
3. van der Aalst, W.M.P.: Process Mining: Discovery, Conformance and Enhancement of Business Processes, 1st edn. Springer Publishing Company Incorporated, Heidelberg (2011)
4. van der Aalst, W.M.P., Rubin, V., Verbeek, H.M.W., van Dongen, B.F., Kindler, E., Günther, C.W.: Process mining: a two-step approach to balance between underfitting and overfitting. Softw. Syst. Model. 9(1), 87–111 (2010)
5. Bergenthum, R., Desel, J., Lorenz, R., Mauser, S.: Process mining based on regions of languages. In: Alonso, G., Dadam, P., Rosemann, M. (eds.) BPM 2007. LNCS, vol. 4714, pp. 375–383. Springer, Heidelberg (2007)
6. Solé, M., Carmona, J.: Process mining from a basis of state regions. In: Lilius, J., Penczek, W. (eds.) PETRI NETS 2010. LNCS, vol. 6128, pp. 226–245. Springer, Heidelberg (2010)
7. Carmona, J.A., Cortadella, J., Kishinevsky, M.: A region-based algorithm for discovering petri nets from event logs. In: Dumas, M., Reichert, M., Shan, M.-C. (eds.) BPM 2008. LNCS, vol. 5240, pp. 358–373. Springer, Heidelberg (2008)
8. van der Werf, J.M.E.M., van Dongen, B.F., Hurkens, C.A.J., Serebrenik, A.: Process Discovery using Integer Linear Programming. Fundam. Inform. 94(3–4), 387–412 (2009)
9. van der Aalst, W.M.P., Weijters, A.J.M.M., Maruster, L.: Workflow Mining: Discovering process models from event logs. IEEE Trans. Knowl. Data Eng. 16(9), 1128–1142 (2004)
10. Adriansyah, A., van Dongen, B.F., van der Aalst, W.M.P.: Conformance Checking using Cost-Based Fitness Analysis. In: Chi, C., Johnson, P. (eds.) IEEE International Enterprise Computing Conference (EDOC 2011), pp. 55–64. IEEE Computer Society (2011)
11. Muñoz-Gama, J., Carmona, J.: A fresh look at precision in process conformance. In: Hull, R., Mendling, J., Tai, S. (eds.) BPM 2010. LNCS, vol. 6336, pp. 211–226. Springer, Heidelberg (2010)
12. Muñoz-Gama, J., Carmona, J.: Enhancing precision in Process Conformance: Stability, confidence and severity. In: CIDM, pp. 184–191. IEEE (2011)
13. Rozinat, A., van der Aalst, W.M.P.: Conformance checking of processes based on monitoring real behavior. Inf. Syst. 33(1), 64–95 (2008)
14. van der Aalst, W.M.P.: Decomposing Petri nets for process mining: A generic approach. Distrib. Parallel Databases 31(4), 471–507 (2013)
15. Verbeek, H.M.W., van der Aalst, W.M.P.: Decomposing Replay Problems: A Case Study. In: Moldt, D. (ed.) PNSE+ModPE. CEUR Workshop Proceedings, vol. 989, pp. 219–235. CEUR-WS.org (2013)
16. van der Wiel, T.: Process mining using integer linear programming. Master’s thesis, Eindhoven University of Technology, Department of Mathematics and Computer Science (2010). http://alexandria.tue.nl/extra1/afstversl/wsk-i/wiel2010.pdf
17. van Dongen, B.F.: BPI Challenge 2012 (2012). http://dx.doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Verbeek, H.M.W., van der Aalst, W.M.P. (2015). Decomposed Process Mining: The ILP Case. In: Fournier, F., Mendling, J. (eds) Business Process Management Workshops. BPM 2014. Lecture Notes in Business Information Processing, vol 202. Springer, Cham. https://doi.org/10.1007/978-3-319-15895-2_23
DOI: https://doi.org/10.1007/978-3-319-15895-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15894-5
Online ISBN: 978-3-319-15895-2
eBook Packages: Computer Science, Computer Science (R0)