Elsevier

Information Systems

Volume 47, January 2015, Pages 258-277
An alignment-based framework to check the conformance of declarative process models and to preprocess event-log data

https://doi.org/10.1016/j.is.2013.12.005

Abstract

Process mining can be seen as the “missing link” between data mining and business process management. The lion's share of process mining research has been devoted to the discovery of procedural process models from event logs. However, often there are predefined constraints that (partially) describe the normative or expected process, e.g., “activity A should be followed by B” or “activities A and B should never be both executed”. A collection of such constraints is called a declarative process model. Although it is possible to discover such models based on event data, this paper focuses on aligning event logs and predefined declarative process models. Discrepancies between log and model are mediated such that observed log traces are related to paths in the model. The resulting alignments provide sophisticated diagnostics that pinpoint where deviations occur and how severe they are. Moreover, selected parts of the declarative process model can be used to clean and repair the event log before applying other process mining techniques. Our alignment-based approach for preprocessing and conformance checking using declarative process models has been implemented in ProM and has been evaluated using both synthetic logs and real-life logs from a Dutch hospital.

Introduction

Traditional Workflow Management (WFM) and Business Process Management (BPM) systems are based on the idea that processes can be described by procedural languages where the completion of one task may enable the execution of other tasks, i.e., procedural models are used to “drive” operational processes. While such a high degree of support and guidance is certainly an advantage when processes are repeatedly executed in the same way, in dynamic and less structured settings (e.g., healthcare) these systems are often considered to be too restrictive. Users need to react to exceptional situations and execute the process in the most appropriate manner. It is difficult, if not impossible, to encode this human flexibility and decision making in procedural models.

Declarative process models acknowledge this and aim at providing freedom without unnecessarily restricting users in their actions. Procedural process models take an “inside-to-outside” approach, i.e., all execution alternatives need to be explicitly specified and new alternatives need to be incorporated in the model. Declarative models use an “outside-to-inside” approach: anything is possible unless explicitly forbidden. Hence, a declarative process model can be viewed as a set of constraints rather than as a procedure.

WFM and BPM systems tend to force people to work in a particular way. When using a declarative WFM and BPM system, more freedom can be offered. However, in most dynamic and less structured settings no system forces users to work in a particular way. This may result in undesirable deviations and inefficiencies. Sometimes there may be good reasons to do things differently. Consider the “breaking the glass” functionality in many systems as a means to deal with exceptions, e.g., using the emergency brakes in case of an accident, unauthorized access to private patient data in case of an emergency, and bypassing an administrative check to help an important customer.

Even though process models are typically not enforced, many events are recorded by today's information systems. As information systems are becoming more and more intertwined with the operational processes they support, “torrents of event data” become available. Therefore, it is interesting to compare observed behavior with modeled behavior.

This paper proposes the implementation of a framework for the analysis of the execution of declarative processes. It is based on the principle of creating an alignment of an event log and a process model. Each trace in the event log is related to a possible path in the process model. Ideally, every event in the log trace corresponds to the execution of an activity in the model. However, it may be the case that the log trace does not fit completely. Therefore, there may be “moves” in the event log that are not followed by “moves” in the model or vice versa.

The alignment concept has successfully been used in the context of procedural models (e.g., [1], [2], [3]); here, we adapt it for declarative models. Similarly to what has been proposed for procedural models, in our approach, events in the log are mapped to executions of activities in the process model. A cost/weight is assigned to every potential deviation. We use the A* algorithm [4] to find, for each trace in the event log, an optimal alignment, i.e., an alignment that minimizes the cost of the deviations. The application of the A* algorithm is more challenging for declarative models than for procedural models. Since in a declarative model everything is allowed unless constrained otherwise, the set of admissible behaviors is generally far larger than the set of behaviors allowed by procedural models. This implies that the search space to find an optimal alignment of a log and a declarative model is much larger. Therefore, for this type of model, it is essential to avoid exploring portions of the search space that certainly lead to non-optimal solutions.

The log–model alignment can be the main input of a wide range of techniques for the analysis of declarative processes. In this regard, Section 3 presents the three main use cases that are considered in this paper. The first use case is concerned with cleaning the event logs by removing log traces that should not be used for further analysis (e.g., incomplete traces). The second use case is about checking the conformance of the event logs against a given declarative model, which can be regarded and measured along several dimensions, highlighting where deviations occur. The third and last use case concerns repairing event logs to make sure that the essential constraints are satisfied before further analysis. These use cases are supported by functionalities that are available in ProM, a generic open-source framework for implementing process mining tools in a standard environment [5].

In this paper, we use Declare as an example of declarative language. Section 2 introduces the basic aspects of the Declare language along with the working example that is used throughout the paper, while Section 4 introduces some background knowledge. Section 5 describes the notion of log–model alignment and some diagnostics that can be computed using alignments. Section 6 describes the application of the A* algorithm to find an optimal alignment. Here, we also introduce an optimization of the algorithm to prune large irrelevant portions of the search space that certainly lead to non-optimal solutions. Section 7 discusses the second use case in detail, i.e., how the alignments can be used to check the conformance of an event log with respect to a Declare model. Section 8 focuses on the first and third use case, i.e., how event logs can be cleaned and repaired. Section 9 reports an evaluation of the different techniques, which is based on synthetic and real-life logs. Section 10 discusses related work, whereas Section 11 concludes the paper and highlights potential future work.

Section snippets

Declare and basic notation

Declare is a declarative language with an intuitive graphical representation to describe constraints and activities [6], [7], [8]. Its formal semantics is based on Linear Temporal Logic (LTL) for finite traces [9] where each constraint is defined through an LTL formula. The Declare toolset includes a graphical designer, a workflow engine, a worklist handler and various analysis tools [10].
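As an illustration of how constraints over finite traces can be evaluated, the following minimal Python sketch encodes the two constraints quoted in the abstract as plain predicates. It does not use LTL or automata as the Declare semantics does; the function names `response`, `not_coexistence` and `satisfies` are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Sequence

Trace = Sequence[str]
Constraint = Callable[[Trace], bool]


def response(a: str, b: str) -> Constraint:
    """Every occurrence of `a` must eventually be followed by `b` (assumes a != b)."""
    def check(trace: Trace) -> bool:
        pending = False  # True while some `a` still waits for a later `b`
        for activity in trace:
            if activity == a:
                pending = True
            elif activity == b:
                pending = False
        return not pending
    return check


def not_coexistence(a: str, b: str) -> Constraint:
    """`a` and `b` must never both occur in the same trace."""
    def check(trace: Trace) -> bool:
        return not (a in trace and b in trace)
    return check


def satisfies(trace: Trace, model: List[Constraint]) -> bool:
    """A trace conforms when it violates none of the constraints."""
    return all(constraint(trace) for constraint in model)


# Hypothetical model: anything is allowed unless a constraint forbids it.
model = [response("A", "B"), not_coexistence("C", "D")]

fitting = satisfies(["A", "C", "B"], model)          # A is followed by B, D is absent
unanswered = satisfies(["A", "B", "A"], model)       # the last A is never followed by B
unconstrained = satisfies(["X", "Y"], model)         # activities outside the model are allowed
```

The last call reflects the “outside-to-inside” view: a trace made only of activities that no constraint mentions is accepted.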

A framework based on log–model alignment for the analysis of declarative processes

This paper proposes a framework for the analysis of declarative processes that is based on three use cases. All three use cases are supported by new functionalities added to ProM. The application of the use cases heavily relies on the computation of log–model alignments, which is also part of this paper. The starting point is a raw event log $L_{raw}$ and a so-called whole model $D_{whole}$=(A

Checking declare constraints

To identify potential deviations of a log from a reference Declare model, we need to map each event in the log to an activity in the model. Each process instance in a log follows a trace (a sequence) of events and different instances may follow the same trace. Therefore, an event log can be seen as a multi-set of traces, i.e., $L \in \mathbb{B}(A_L^*)$, where $A_L$ is the set of activities in the log. Since Declare is an “open” language, it allows for the execution of any activity which is not in the model.
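The multi-set view of an event log can be sketched in Python with `collections.Counter`, which maps each distinct trace to the number of process instances that followed it. The case identifiers and activity names below are made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

TraceKey = Tuple[str, ...]


def to_multiset(cases: Dict[str, List[str]]) -> Counter:
    """Collapse process instances into a multi-set of traces."""
    return Counter(tuple(trace) for trace in cases.values())


def activities_of(log: Counter) -> Set[str]:
    """The set of activities that occur anywhere in the log."""
    return {activity for trace in log for activity in trace}


# Hypothetical process instances; case1 and case3 follow the same trace.
cases = {
    "case1": ["A", "B"],
    "case2": ["A", "C", "B"],
    "case3": ["A", "B"],
}

log = to_multiset(cases)
log_activities = activities_of(log)
```

Storing each distinct trace once is convenient in practice, since an alignment only needs to be computed once per distinct trace.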

Alignment of event logs and Declare models

To check the conformance of an event log $L$ with respect to a Declare model $D$, we adopt an approach where we search for an alignment of the log and the model. An alignment relates moves in log and moves in model as explained in the following definition. Here, we explicitly indicate no move with $\gg$. We indicate set $\Sigma \cup \{\gg\}$ with $\Sigma_{\gg}$, where $\Sigma$ denotes the input alphabet of each constraint automaton in $D$.

Definition 2 Alignment and complete alignment

A pair $(s', s) \in (\Sigma_{\gg} \times \Sigma_{\gg}) \setminus \{(\gg, \gg)\}$ is

  • a move in log if $s' \in \Sigma$ and $s = \gg$;

  • a move in model if $s' = \gg$ and $s \in \Sigma$;

  • a
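The two kinds of moves named above can be sketched in Python as follows. The string `">>"` stands in for the no-move symbol, and the label `"both"`, the projection helpers and the unit costs are assumptions of this sketch: the snippet is cut off before the remaining cases of the definition, and the paper's actual cost function is not shown here.

```python
from typing import List, Tuple

NO_MOVE = ">>"  # stand-in for the no-move symbol (assumption of this sketch)
Move = Tuple[str, str]  # (log component, model component)


def classify(move: Move) -> str:
    """Name the kind of move; the pair (>>, >>) is excluded by the definition."""
    s_log, s_model = move
    if s_log == NO_MOVE and s_model == NO_MOVE:
        raise ValueError("(>>, >>) is not a legal move")
    if s_model == NO_MOVE:
        return "move in log"
    if s_log == NO_MOVE:
        return "move in model"
    return "both"  # both components are activities (label is hypothetical)


def log_projection(alignment: List[Move]) -> List[str]:
    """Recover the observed trace by dropping the no-move entries."""
    return [s_log for s_log, _ in alignment if s_log != NO_MOVE]


def model_projection(alignment: List[Move]) -> List[str]:
    """Recover the model-side execution by dropping the no-move entries."""
    return [s_model for _, s_model in alignment if s_model != NO_MOVE]


def cost(alignment: List[Move], log_cost: int = 1, model_cost: int = 1) -> int:
    """Sum assumed unit costs over moves in log and moves in model."""
    kinds = [classify(move) for move in alignment]
    return log_cost * kinds.count("move in log") + model_cost * kinds.count("move in model")


# Hypothetical alignment: C was observed but not allowed, B was required but not observed.
alignment = [("A", "A"), ("C", NO_MOVE), (NO_MOVE, "B")]
observed = log_projection(alignment)
executed = model_projection(alignment)
total = cost(alignment)
```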

The A* algorithm for computing log–model alignments

Suppose we have a graph $V$ with costs associated with its arcs. The A* algorithm, initially proposed in [4], is a pathfinding search in $V$. It starts at a given source node $v_0 \in V$ and explores adjacent nodes until one node of a given target set $V_{Trg} \subseteq V$ is reached, aiming at finding the path with the overall lowest cost.

Every node $v \in V$ is also associated with a cost, which is determined by an evaluation function $f(v) = g(v) + h(v)$, where

  • $g: V \rightarrow \mathbb{R}_0^+$ is a function that returns the smallest path cost from $v_0$
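The search described above can be sketched as a generic A* in Python. This is a textbook formulation on an explicit toy graph, not the paper's alignment-specific search or its pruning optimization; the graph, the heuristic values and the name `a_star` are assumptions made for the example. The heuristic is taken to be admissible, i.e., it never overestimates the remaining cost.

```python
import heapq
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Set, Tuple

Node = Hashable


def a_star(
    source: Node,
    targets: Set[Node],
    successors: Callable[[Node], Iterable[Tuple[Node, float]]],
    h: Callable[[Node], float],
) -> Optional[Tuple[float, List[Node]]]:
    """Return (cost, path) of a cheapest path from source to any target, or None."""
    best_g: Dict[Node, float] = {source: 0.0}  # cheapest known cost from the source
    parent: Dict[Node, Optional[Node]] = {source: None}
    # Heap entries are (f, g, tie-breaker, node) with f = g + h.
    frontier = [(h(source), 0.0, 0, source)]
    counter = 1
    while frontier:
        _, g, _, node = heapq.heappop(frontier)
        if g > best_g[node]:
            continue  # stale entry: a cheaper route to this node was found later
        if node in targets:
            path: List[Node] = []
            current: Optional[Node] = node
            while current is not None:
                path.append(current)
                current = parent[current]
            return g, path[::-1]
        for nxt, arc_cost in successors(node):
            new_g = g + arc_cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(frontier, (new_g + h(nxt), new_g, counter, nxt))
                counter += 1
    return None


# Hypothetical graph and admissible heuristic values.
arcs = {"v0": [("a", 1), ("b", 4)], "a": [("t", 5), ("b", 1)], "b": [("t", 1)], "t": []}
estimate = {"v0": 3, "a": 2, "b": 1, "t": 0}

result = a_star("v0", {"t"}, lambda v: arcs[v], lambda v: estimate[v])
```

On this toy graph the search returns the path v0, a, b, t with total cost 3, preferring the detour through `a` over the direct but more expensive arc from `v0` to `b`.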

Conformance checking using alignments

There are four basic quality dimensions for checking the conformance of event logs with respect to process models: (a) fitness, (b) precision, (c) generalization and (d) simplicity [18]. A model with good fitness allows for most of the behavior seen in the event log. A model has a perfect fitness if all traces in the log can be replayed by the model from the beginning to the end.

A model is precise if it does not allow for “too much” behavior. A model that is not precise is “underfitting”, i.e.,

Usage of the optimal alignments to clean and repair event logs

This section discusses the first and third use case of the framework described in Section 3, i.e., how to clean and repair event logs. Let $\mathcal{L} = (L, \Lambda)$ be an event log and let $D_{whole} = (A_{whole}, \Pi_{whole})$ be a Declare (whole) model. For each trace $\sigma \in L$, let $\gamma_\sigma$ be an optimal alignment of $\sigma$ and a core model $D_{core} = (A_{core}, \Pi_{core})$ such that $\Pi_{core} \subseteq \Pi_{whole}$.

A cleaned log $\mathcal{L}_{clean}$ of $\mathcal{L}$ with respect to $D_{core}$ is an event log $(L_c, \Lambda_c)$, where $L_c$ only contains the log traces $\sigma \in L$ such that $\gamma_\sigma$ contains no move in the log or the
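The cleaning step can be sketched in Python as a filter over precomputed alignments. Because the definition above is cut off, this sketch assumes that a trace is kept exactly when its optimal alignment contains neither moves in log nor moves in model; it also ignores the second component of the log. The symbol `">>"` and the names `is_deviation_free` and `clean_log` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

NO_MOVE = ">>"  # stand-in for the no-move symbol (assumption of this sketch)
Move = Tuple[str, str]
TraceKey = Tuple[str, ...]


def is_deviation_free(alignment: List[Move]) -> bool:
    """True when no move has a no-move entry on either side."""
    return all(NO_MOVE not in move for move in alignment)


def clean_log(log: Counter, alignments: Dict[TraceKey, List[Move]]) -> Counter:
    """Keep only traces whose alignment with the core model shows no deviation."""
    return Counter(
        {trace: count for trace, count in log.items() if is_deviation_free(alignments[trace])}
    )


# Hypothetical log and optimal alignments with respect to a core model.
raw_log = Counter({("A", "B"): 2, ("A",): 1})
optimal = {
    ("A", "B"): [("A", "A"), ("B", "B")],
    ("A",): [("A", "A"), (NO_MOVE, "B")],  # B had to be inserted on the model side
}

cleaned = clean_log(raw_log, optimal)
```

Trace multiplicities are preserved, so the cleaned log remains a multi-set of traces.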

Implementation and evaluation

In order to evaluate the three use cases described in Section 3, we have implemented a series of plug-ins of ProM, a generic open-source framework for implementing process mining functionalities [5]:

    Declare Replayer

    It takes a Declare model and an event log as input and, using the algorithm described in Section 6, finds a multi-set of optimal alignments, i.e., one alignment for each trace in the event log. This plug-in is also in charge of computing the score of the different dimensions of

Related work

Over the last decade process mining evolved into an active research field relating data analysis (data mining, machine learning, etc.) and process analysis. See [18] for an introduction to process mining. We refer to the recently released Process Mining Manifesto [23] for the main challenges in process mining. The focus of this paper is on the analysis of the execution of Declare models; therefore we do not elaborate on process discovery techniques.

Conformance checking is highly relevant for

Conclusion

This paper presents a novel log preprocessing and conformance checking approach tailored towards declarative models. Most conformance checking techniques defined for procedural models (e.g., Petri nets) are not directly applicable to declarative models since they are based on playing the “token game” while counting missing and remaining tokens. Moreover, these techniques tend to provide poor diagnostics, e.g., just reporting the fraction of fitting cases. We adapted alignment-based approaches

References (36)

  • S. Zugal, J. Pinggera, B. Weber, The impact of testcases on the maintainability of declarative process models, in:...
  • P. Pichler, B. Weber, S. Zugal, J. Pinggera, J. Mendling, H.A. Reijers, Imperative versus declarative process modeling...
  • D. Giannakopoulou, K. Havelund, Automata-based verification of temporal properties on running programs, in: Proceedings...
  • M. Westergaard, F.M. Maggi, Declare: a tool suite for declarative workflow modeling and enactment, in: Proceedings of...
  • W.M.P. van der Aalst et al., Declarative workflows: balancing between flexibility and support, Comput. Sci. Res. Dev. (2009)
  • M. Westergaard, Better algorithms for analyzing and enacting declarative workflow languages using LTL, in: Proceedings...
  • A. Bauer, M. Leucker, C. Schallhart, Runtime verification for LTL and TLTL, ACM Trans. Softw. Eng. Methodol. 20 (4)...
  • A. Awad, G. Decker, M. Weske, Efficient compliance checking using BPMN-Q and temporal logic, in: Proceedings of the 6th...