Matching observed behavior and modeled behavior: An approach based on Petri nets and integer programming
Introduction
Models exist for many real-world processes. These models are descriptive or prescriptive, i.e., they are used to describe a process or they are used to control or guide the system. Typical examples are the so-called reference models in the context of Enterprise Resource Planning (ERP) systems like SAP [15]. The SAP reference models are expressed in terms of so-called Event-driven Process Chains (EPCs) [14] describing how people should/could use the SAP R/3 system. Similar models are used in the workflow domain [25], and also in many other domains ranging from flexible manufacturing and telecommunication to operating systems and software components [17]. In some domains these models are referred to as specifications or blueprints. In practice, the actual process may deviate from the modeled process, e.g., the implementation is not consistent with the specification or people use SAP R/3 in a way not modeled in any of the EPCs.
Clearly, the problem of checking whether the modeled behavior and the observed behavior match is not new. However, when we applied our process mining techniques [28] to SAP R/3 we were confronted with the following interesting problem: The logs of SAP do not allow for monitoring individual cases (e.g., purchase orders). Instead SAP only logs the fact that a specific transaction has been executed (without referring to the corresponding case). Hence, tools like the SAP Reverse Business Engineer (RBE) report on the frequencies of transaction types and not on the cases themselves. These transactions can be linked to functions in the EPCs, but, as indicated, not to individual cases. Moreover, some functions in the EPC do not correspond to a transaction code, and therefore, are not logged at all. This raises the following interesting question: Do the modeled behavior (i.e., the EPC) and the observed behavior (i.e., the transaction frequencies) match?
The problem of checking whether the modeled behavior and the observed behavior match is not only relevant in the context of SAP. In a wide variety of applications only frequencies are recorded and/or it is impossible to link events to specific cases. Therefore, we consider an abstraction of the problem. Consider a Petri net with some initial marking [18], [19] and a frequency profile, which is a partial function indicating how many times certain transitions fired. Consider for example the marked Petri net shown in Fig. 1. A frequency profile fp could be fp(a) = 3, fp(b) = 2, fp(c) = 2, fp(d) = 2, and fp(e) = 3, thus indicating the number of times each transition occurred. However, the modeled behavior (i.e., the marked Petri net) and the observed behavior (the frequency profile fp) do not match. It is easy to see that fp(b) + fp(c) cannot exceed fp(a) since b and c depend on the tokens produced by a, yet here fp(b) + fp(c) = 4 while fp(a) = 3. Now consider another frequency profile fp: fp(a) = 3, fp(b) = 2, fp(d) = 2, and fp(e) = 3, i.e., the number of times c occurred is unknown. Now the modeled behavior and the observed behavior match, i.e., the observed transition frequencies are consistent with the Petri net model. Moreover, it is clear that in this situation c occurred precisely once.
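The two frequency profiles above can be replayed with a small sketch. Fig. 1 is not reproduced in this text, so the net below is a hypothetical stand-in chosen only to be consistent with the description (a feeds b and c, d follows b, and e consumes what c and d produce). The names NET, INITIAL, final_marking, and consistent_completions are illustrative assumptions, and the brute-force search is a simplification, not the IP formulation of the paper. It only tests that no place ends up with a negative token count, which is a necessary but not sufficient condition for a match.

```python
from itertools import product

# Hypothetical net, NOT the net of Fig. 1 (an assumption for illustration).
# Each transition maps to (places it consumes from, places it produces into).
NET = {
    "a": ((), ("p1",)),
    "b": (("p1",), ("p2",)),
    "c": (("p1",), ("p3",)),
    "d": (("p2",), ("p3",)),
    "e": (("p3",), ()),
}
INITIAL = {"p1": 0, "p2": 0, "p3": 0}  # assumed empty initial marking


def final_marking(counts):
    """Apply the net effect of firing each transition counts[t] times."""
    marking = dict(INITIAL)
    for transition, n in counts.items():
        consumed, produced = NET[transition]
        for place in consumed:
            marking[place] -= n
        for place in produced:
            marking[place] += n
    return marking


def consistent_completions(fp, bound=10):
    """Return all values (0..bound) for unrecorded transitions that keep
    every place non-negative; an empty list means no match was found."""
    unknown = [t for t in NET if t not in fp]
    results = []
    for values in product(range(bound + 1), repeat=len(unknown)):
        counts = {**fp, **dict(zip(unknown, values))}
        if all(v >= 0 for v in final_marking(counts).values()):
            results.append(dict(zip(unknown, values)))
    return results


full_profile = {"a": 3, "b": 2, "c": 2, "d": 2, "e": 3}
partial_profile = {"a": 3, "b": 2, "d": 2, "e": 3}
print(consistent_completions(full_profile))
print(consistent_completions(partial_profile))
```

On this assumed net the first profile has no consistent completion, while the second leaves exactly one value for c. Enumeration is used only to keep the sketch free of dependencies; an IP solver would replace the loop for nets of realistic size.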
In the remainder we will focus on this problem and propose an approach based on Integer Programming (IP) [23], [35]. Using a marked Petri net and a frequency profile, an IP problem is formulated to check whether the modeled behavior and the observed behavior match and, if so, the frequencies of transitions not recorded in the profile are determined. First, we introduce some preliminaries, i.e., process mining, Petri nets, and integer programming, and discuss related work. Then we focus on the core problem and formulate the IP problem. We demonstrate the applicability of our approach using an example. Moreover, we show in more detail why the problem is relevant in the context of SAP and apply the approach to a SAP process model. Finally, we conclude the paper by summarizing the results and discussing future work.
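As a rough orientation, one plausible shape for such an IP problem is sketched here, assuming it builds on the standard marking (state) equation of Petri nets. The symbols P (places), T (transitions), W (arc weights), m_0 (initial marking), m' (resulting marking), and x (firing counts) are notation introduced for this sketch and need not coincide with the formulation used later in the paper.

```latex
\begin{align*}
\text{find}\quad & x \in \mathbb{Z}^{T},\; m' \in \mathbb{Z}^{P} \\
\text{such that}\quad
 & m'(p) = m_0(p) + \sum_{t \in T} \bigl(W(t,p) - W(p,t)\bigr)\, x(t)
   && \forall p \in P \\
 & x(t) = \mathit{fp}(t) && \forall t \in \mathrm{dom}(\mathit{fp}) \\
 & x(t) \ge 0 && \forall t \in T \\
 & m'(p) \ge 0 && \forall p \in P
\end{align*}
```

Under this reading, an infeasible problem means the profile cannot match the net, and in a feasible solution the values x(t) for transitions outside dom(fp) are candidate frequencies for the unrecorded transitions.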
Section snippets
Preliminaries
This section presents some preliminaries needed in the remainder of the paper. We first discuss the concept of process mining and then introduce the two techniques used in this paper: Petri nets and Integer Programming. Finally, we present some related work.
Matching a marked Petri net and a frequency profile
As indicated in the introduction, we use Petri nets to model processes. However, other types of models, e.g., the EPCs used by the SAP reference model, can be mapped onto Petri nets. Petri nets may be used to model a wide variety of processes. A Petri net can model what we think the process is (i.e., a descriptive
Example
After showing a number of abstract examples, we now use the more realistic example shown in Fig. 7. It describes the workflow [25] of handling orders. The upper half models the logistical subprocess while the lower half models the financial subprocess. Most of the workflow should be self-explanatory except perhaps for the construct involving c7 and t10 (reminder): A reminder can only be sent if the goods have been shipped.
Unlike the other two Petri nets, the initial marking is empty. Instead a
Extensions
A Linear Programming (LP) problem can be solved in polynomial time, whereas solving an IP problem is NP-complete [23], [35]. Therefore, it may be interesting to consider the LP relaxation of IP(PN, M, fp). We expect that in some cases this will provide good results. Note that the rounded LP relaxation often provides a feasible but non-optimal solution (but not always, cf. the example net shown on page 269 in [6]). Since the objective function is of less interest, this is not a problem. Also note that if
Application in the context of SAP
The problem addressed in this paper applies to a wide variety of systems. However, we first encountered it when we started to apply process mining in the context of SAP R/3 [10], [15]. Given the widespread use of SAP, this has been the main motivation for the research reported in this paper. Based on a detailed analysis of the various SAP logs we discovered that none of them provides the type of event log shown in Table 1 [32]. There are two
Conclusion
Inspired by a problem encountered when applying process mining techniques to SAP transaction logs, the paper tackled the problem of checking whether a Petri net and a frequency profile match. An IP problem was proposed to efficiently implement a necessary but not sufficient condition. The approach allows for extensions not possible in the traditional linear algebraic approaches [17], [6], [24]. Clearly, the application is not limited to SAP transaction logs but is applicable in any situation
Acknowledgments
The author would like to thank Eric Verbeek for proofreading an early version of the paper and Monique Jansen-Vullers and Michael Rosemann for their joint work on mining SAP and configurable process models, which uncovered the problem addressed in this paper. Moreover, Martijn van Giessel contributed through his Master's thesis on mining SAP logs.
References (36)
- et al., Mining Configurable Enterprise Information Systems. Data and Knowledge Engineering (2006)
- et al., Business Process Cockpit
- Formalization and verification of event-driven process chains. Information and Software Technology (1999)
- et al., Workflow mining: A survey of issues and approaches. Data and Knowledge Engineering (2003)
- et al., Mining Process Models from Workflow Logs
- et al., Discovering models of software processes from event-based data. ACM Transactions on Software Engineering and Methodology (1998)
- et al., SAP R/3 Business Blueprint: Understanding the Business Process Reference Model (1997)
- et al., Bridging the gap between business models and workflow specifications. International Journal of Cooperative Information Systems (2004)
- Basic Linear Algebraic Techniques of Place/Transition Nets
- Free Choice Petri Nets, Volume 40 of Cambridge Tracts in Theoretical Computer Science
- A Machine Learning Approach to Workflow Management
- The SAP R/3 Handbook
- A new polynomial-time algorithm for linear programming. Combinatorica
- Semantische Processmodellierung auf der Grundlage Ereignisgesteuerter Processketten (EPK). Veröffentlichungen des Instituts für Wirtschaftsinformatik, Heft 89 (in German)
- SAP R/3 Process Oriented Implementation
Cited by (14)
Process mining techniques and applications – A systematic mapping study
2019, Expert Systems with Applications
A business process gap detecting mechanism between information system process flow and internal control flow
2009, Decision Support Systems
A Purpose-Guided Log Generation Framework
2022, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
IT Availability Risks in Smart Factory Networks – Analyzing the Effects of IT Threats on Production Processes Using Petri Nets
2022, Information Systems Frontiers
Process Profiling based Synthetic Event Log Generation
2019, IC3K 2019 - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
A comparative study and evaluation of ERP reference models in the context of ERP IT-driven implementation: SAP ERP as a case study
2018, Business Process Management Journal
Wil van der Aalst is a full professor of Information Systems and head of the Information Systems department of the Faculty of Technology Management at Eindhoven University of Technology. Currently he is also an adjunct professor at Queensland University of Technology (QUT) working within the Centre for Information Technology Innovation (CITI). His research interests include information systems, simulation, process mining, Petri nets, process models, workflow management systems, verification techniques, enterprise resource planning systems, computer supported cooperative work, and interorganizational business processes.