Abstract
The practical relevance of process mining is increasing as more and more event data become available. Process mining techniques aim to discover, monitor and improve real processes by extracting knowledge from event logs. The two most prominent process mining tasks are: (i) process discovery: learning a process model from example behavior recorded in an event log, and (ii) conformance checking: diagnosing and quantifying discrepancies between observed behavior and modeled behavior. The increasing volume of event data provides both opportunities and challenges for process mining. Existing process mining techniques have problems dealing with large event logs that refer to many different activities. Therefore, we propose a generic approach to decompose process mining problems, one that can be combined with different existing process discovery and conformance checking techniques. Computationally challenging process mining problems can be split into many smaller problems that are easy to analyze and whose results can be combined into solutions for the original problems.
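As a rough illustration of what splitting a problem into smaller ones can look like, the sketch below projects an event log onto subsets of its activities, yielding one smaller sublog per subset. This is a minimal sketch under stated assumptions: the abstract does not specify the mechanism, so projection is used here as one plausible reading, and the names `project`, `decompose_log`, the toy log, and the activity sets are all hypothetical. How to choose the activity sets and how to combine the per-sublog results are the substance of the approach and are not shown.

```python
from collections import Counter
from typing import FrozenSet, List, Sequence, Tuple

Trace = Tuple[str, ...]


def project(trace: Trace, activities: FrozenSet[str]) -> Trace:
    """Keep only the events of a trace whose activity is in the given set."""
    return tuple(a for a in trace if a in activities)


def decompose_log(
    log: Sequence[Trace], activity_sets: Sequence[FrozenSet[str]]
) -> List[Counter]:
    """Build one sublog (a multiset of projected traces) per activity set."""
    return [Counter(project(t, s) for t in log) for s in activity_sets]


# Toy event log: each trace is the sequence of activities of one case.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "b", "c", "d")]

# Hypothetical, hand-picked activity sets (assumption: overlap is allowed).
activity_sets = [frozenset({"a", "b"}), frozenset({"b", "c", "d"})]

sublogs = decompose_log(log, activity_sets)
for s, sublog in zip(activity_sets, sublogs):
    print(sorted(s), dict(sublog))
```

Each sublog mentions fewer activities than the original log, so a discovery or conformance checking technique applied to it faces a smaller problem.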
Acknowledgements
This work was supported by the Basic Research Program of the National Research University Higher School of Economics (HSE) in Moscow.
Additional information
Communicated by: Divyakant Agrawal.
About this article
Cite this article
van der Aalst, W.M.P. Decomposing Petri nets for process mining: A generic approach. Distrib Parallel Databases 31, 471–507 (2013). https://doi.org/10.1007/s10619-013-7127-5
DOI: https://doi.org/10.1007/s10619-013-7127-5