1 Introduction

The execution of complex, knowledge-driven business processes generally produces large volumes of event records corresponding to the execution of activities in those processes. Dealing with large-scale process variations [9] is a significant challenge in process-centric enterprise organizations. This is typically the case in Business Process Outsourcing (BPO) support organizations, as they support multiple business processes for different clients. Given the strictness of the service contracts governing such business processes and the penalties for violating them, these organizations seek to improve the performance of individual process executions. Most of the current literature assumes that the performance of a process instance is entirely determined by what happens over the course of its execution. We see limitations in such assumptions [7] when they are applied to knowledge-intensive process models, where specific instance executions are dictated by other factors that are not part of the process execution. Mining such factors and discovering their correlations with process performance and validity of execution remains a significant challenge. Research in the field of process flexibility management [8, 11] treats goal models as one such factor that dictates the execution of processes, in addition to contexts. Goal models [10] provide a natural underpinning for both validating process executions and classifying them as different variants. In this paper, we argue that correlating semantic effect traces with goal alignments and the associated context facilitates a comprehensive approach to predicting performance. In our earlier work [8], we discussed the notion of process effect logs as a series of time-stamped ticket description entries, along with the semantic trace (effects from process execution), task details, performance time spent and the process instance identifier.
The process performance time is defined as the time interval between the start and completion of execution of a process instance.
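
As a minimal sketch of this definition, the snippet below derives the performance time of each process instance from time-stamped event records. It assumes that the earliest and latest events of an instance mark its start and completion; the function name, instance identifiers and time stamps are hypothetical and serve only as illustration.

```python
from datetime import datetime

def performance_minutes(events):
    """Map each instance id to the minutes between its first and last event."""
    bounds = {}
    for instance_id, stamp in events:
        t = datetime.fromisoformat(stamp)
        start, end = bounds.get(instance_id, (t, t))
        # Widen the observed [start, end] window for this instance.
        bounds[instance_id] = (min(start, t), max(end, t))
    return {
        pid: (end - start).total_seconds() / 60
        for pid, (start, end) in bounds.items()
    }

# Hypothetical event records: (process instance identifier, time stamp).
events = [
    ("INC-1", "2017-03-01T09:00:00"),
    ("INC-1", "2017-03-01T09:45:00"),
    ("INC-2", "2017-03-01T10:00:00"),
    ("INC-1", "2017-03-01T10:30:00"),
    ("INC-2", "2017-03-01T10:20:00"),
]
perf = performance_minutes(events)
print(perf)
```

Events may arrive in any order, since only the minimum and maximum time stamps per instance are retained.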

2 Motivating Example

Fig. 1. Annotated incident resolution process in VAGAI tool

Fig. 2. Incident management goal model

In this paper, we consider the incident management process design depicted in Fig. 1 as our running example. A process log containing 1400 executed instances of this process design is used to evaluate our proposed approach. The log contains over 25000 task execution records. Each process instance in this log records how, after a complaint is received from a customer, an incident ticket is created, resolved and closed. We leverage goal models annotated with end effects, as illustrated in Fig. 2. Such a goal model can be constructed through the goal refinement machinery discussed in [2].

A variety of outcome-predicting process monitoring techniques have been proposed in the literature [6]. In [4], the authors establish the need for a general framework for mining and correlating business process characteristics from event logs. In [1], the authors discuss the construction of a configurable process model as a family of process variants discovered from a collection of event logs. Existing work on the contextual correlation of business processes has addressed challenges related to collaboration, contract conformance and process flexibility [5]. In comparison, our work uses contextual factors and semantic effect traces from both partial and completed executions to correlate and predict execution deviations based on goal alignments. Works such as [3] focus on generating performance predictions by leveraging process simulation data. Works such as [12] focus on creating hybrid process models by leveraging event log clusters. In comparison, we focus on an orthogonal approach of discovering multiple process designs that are goal-aligned variants of the original process design.

3 Identifying Process Context, Goals and Process State

Contextual information can be traced from process instances to a range of time-stamped information sources, such as statements being made on enterprise social media, financial market data, weather data and so on. Process log time-stamps can be correlated with time-stamps in these repositories of information to derive a wealth of information about the context within which a process instance was executed. In our proposed approach, we leverage this specific category of contextual information.
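
The time-stamp correlation described above can be sketched as a window lookup: every context observation whose time stamp falls within the execution window of a process instance is attached to that instance. This is only an illustration under stated assumptions. The function name, the feed contents and the closed-interval matching rule are hypothetical, and the context labels are borrowed from the running example purely as sample values.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

def context_for_window(context_feed, start, end):
    """Return labels of context records stamped within [start, end].

    context_feed is a list of (time stamp, label) pairs sorted by time stamp.
    """
    stamps = [stamp for stamp, _ in context_feed]
    lo = bisect_left(stamps, start)   # first record at or after start
    hi = bisect_right(stamps, end)    # one past the last record at or before end
    return [label for _, label in context_feed[lo:hi]]

# Hypothetical time-stamped context source, sorted by time stamp.
feed = [
    (datetime(2017, 3, 1, 8, 0), "SoftwareUpgrade"),
    (datetime(2017, 3, 1, 9, 30), "AlertsComplete"),
    (datetime(2017, 3, 1, 12, 0), "DataIssues"),
]

# Hypothetical execution window of one process instance.
window_start = datetime(2017, 3, 1, 9, 0)
window_end = datetime(2017, 3, 1, 10, 30)
matched = context_for_window(feed, window_start, window_end)
print(matched)
```

A real deployment would likely need a tolerance around the window, since context events that precede an instance can still influence it.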

The performance indicators associated with process effect assertions are typically influenced by the entailment of specific OR-refinement sub-goals (e.g., Email confirmation or Telephonic confirmation with customer) in the goal model. Given a state S and a set of effect assertions e obtained from events accruing from the execution of a task, the resulting partial state is given by \(S \oplus e\), where \(\oplus \) is a state update operator [8]. Similarly, given a normative state \(S_N\) and a set of effect assertions \(e_N\) obtained from events accruing from the execution of a task in a process, the resulting partial state is given by \(S_N \oplus e_N\). We also use a knowledge base KB of domain constraints. If \(S \cup e\cup KB\) is consistent, then \(S\oplus e = S\cup e\). Otherwise, \(S\oplus e = e\cup \{s\mid s \subseteq S,\ s\cup e\cup KB \text{ is consistent, and there does not exist any } s' \text{ where } s \subset s' \subseteq S \text{ such that } s'\cup e\cup KB \text{ is consistent}\}\). Informally, the new effects are retained together with a maximal subset of the prior state that is consistent with them and with KB. We start with an initial partial state description (which may be empty) and incrementally update it (using \(\oplus \)) until we reach the partial state immediately following the final task in the process instance. To achieve this, the proposed machinery leverages the OR-refinement goal correlations associated with each state transition in the process event log. To generate goal correlations based on the end effects (at the process or task level), we leverage the Process Instance Goal Alignment Model (PIGA) discussed in our previous work [8]. Given a goal-realizing effect group S, finding its correlation with a goal G amounts, in formal terms, to finding the truth assignments in the CNF expression of G using the cumulative end effects of S. To generate the PIGA, the list of state transitions and the goal decomposition model are taken as input. Then, for each event group in the process log, the truth assignments of all goals in the goal model are validated. This is repeated for all event groups in the process log to identify the “valid process instances”. Representing each process instance as a list of maximally refined correlated goals completes the generation of the Process Instance Goal Alignment (PIGA).
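
The state update and goal correlation steps can be sketched as follows. This is not the VAGAI implementation but an illustration under several assumptions: effects are propositional literals written as strings (with `~` marking negation), KB is reduced to sets of literals that may not hold together, the update returns one candidate state per maximal consistent subset of the prior state, and a CNF goal is taken to hold when every clause contains a literal present in the cumulative effects. All effect and goal names are hypothetical.

```python
from itertools import combinations

def negate(literal):
    return literal[1:] if literal.startswith("~") else "~" + literal

def is_consistent(literals, kb):
    """Assumed test: no literal occurs with its negation, and no KB
    constraint (a set of jointly forbidden literals) is fully present."""
    literals = set(literals)
    if any(negate(lit) in literals for lit in literals):
        return False
    return not any(forbidden <= literals for forbidden in kb)

def state_update(state, effects, kb):
    """Candidate results of S (+) e: e joined with each maximal subset of S
    that stays consistent with e and KB (empty list if e itself is not)."""
    state, effects = frozenset(state), frozenset(effects)
    if is_consistent(state | effects, kb):
        return [state | effects]
    maximal = []
    for size in range(len(state) - 1, -1, -1):  # try largest subsets first
        for subset in map(frozenset, combinations(sorted(state), size)):
            if any(subset < m for m in maximal):
                continue  # strictly contained in a larger consistent subset
            if is_consistent(subset | effects, kb):
                maximal.append(subset)
    return [m | effects for m in maximal]

def satisfies(goal_cnf, effects):
    """A CNF goal holds if each clause has at least one literal in effects."""
    return all(any(lit in effects for lit in clause) for clause in goal_cnf)

# Hypothetical domain constraint: a ticket cannot be open and closed at once.
kb = [frozenset({"ticket_open", "ticket_closed"})]
# Hypothetical effect assertions of three consecutive tasks.
task_effects = [{"ticket_open"}, {"resolved", "email_confirmed"}, {"ticket_closed"}]

state = frozenset()  # initial partial state, here empty
for e in task_effects:
    state = state_update(state, e, kb)[0]  # follow the first candidate state

# Hypothetical goals in CNF: a list of clauses, each a set of literals.
goals = {
    "ConfirmWithCustomer": [{"email_confirmed", "phone_confirmed"}],
    "EmailConfirmation": [{"email_confirmed"}],
    "TelephonicConfirmation": [{"phone_confirmed"}],
}
aligned = sorted(g for g, cnf in goals.items() if satisfies(cnf, state))
print(sorted(state), aligned)
```

In this run the third task's effect `ticket_closed` conflicts with `ticket_open`, so the update drops `ticket_open` and keeps the remaining prior effects. The subset search is exponential in the size of the state, which is acceptable for a sketch but not for large effect sets.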

Table 1. Context correlated goal models (CCGM)

The CCGM generated for our running example is illustrated in Table 1. For example, as observed in row 3, 11 process instances were partially executed without a resolution to the reported incident due to a collection of contextual factors (CM3). To support predictions at both the process and individual task levels, we leverage two categories of effect log data sets. In the Process Data Set (PD), each record is a tuple {Process Instance Identifier, semantic trace, process execution time, context, aligned OR-refinement sub-goals}. In the Task Data Set (TD), each record is a tuple {Process Instance Identifier, Task Identifier, semantic trace from the execution of the task, task execution time, total process execution time, context, task-aligned goals, process-aligned goals}.
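
The two record layouts can be written down as typed structures. The sketch below mirrors the tuple fields listed above; the class names, field names, field types and sample values are assumptions made for illustration, as is the use of minutes for all durations.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ProcessRecord:
    """One record of the Process Data Set (PD)."""
    process_instance_id: str
    semantic_trace: List[str]
    process_execution_time: float      # minutes (assumed unit)
    context: List[str]
    aligned_or_subgoals: List[str]

@dataclass
class TaskRecord:
    """One record of the Task Data Set (TD)."""
    process_instance_id: str
    task_id: str
    task_semantic_trace: List[str]
    task_execution_time: float         # minutes (assumed unit)
    total_process_execution_time: float
    context: List[str]
    task_aligned_goals: List[str]
    process_aligned_goals: List[str]

# Hypothetical sample records for a single process instance.
pd_row = ProcessRecord(
    process_instance_id="INC-1",
    semantic_trace=["ticket_open", "resolved", "email_confirmed", "ticket_closed"],
    process_execution_time=90.0,
    context=["PasswordReset", "AgentExplow"],
    aligned_or_subgoals=["EmailConfirmation"],
)
td_row = TaskRecord(
    process_instance_id="INC-1",
    task_id="T3",
    task_semantic_trace=["email_confirmed"],
    task_execution_time=15.0,
    total_process_execution_time=pd_row.process_execution_time,
    context=pd_row.context,
    task_aligned_goals=["EmailConfirmation"],
    process_aligned_goals=pd_row.aligned_or_subgoals,
)
print(pd_row)
print(td_row)
```

The TD record repeats the process-level time, context and goals so that task-level questions can be asked without joining back to PD.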

For our evaluation in this paper, we used the Deep QA pipeline of the Watson Analytics Engine to generate insights for a set of questions. The training data belongs to the two categories of process log data sets, PD and TD. The questions asked using both data sets are listed in Table 2.

Table 2. Questions to Watson Analytics Engine

4 Empirical Evaluation

In this section, we evaluate our proposed approach using the event log data set discussed in Sect. 2. Our evaluation is conducted in two phases. Phase 1: This is a pre-processing step that generates the effect logs provided as input data to the Watson Analytics Engine (used in Phase 2). The VAGAI tool [8] annotates semantic traces from process logs with goal alignments to generate process effect logs (PD) and task effect logs (TD), respectively. Phase 2: The Watson Analytics Engine is used to generate performance and goal alignment predictions from the PD and TD data sets, respectively, as depicted in Table 2. For individual task-level executions, the alignment predictions are made at the level of OR-refinement sub-goals (each providing an alternate realization of its parent goal) for a given goal model. These predictions are based on the accumulated effects at the completion of the corresponding task execution.

Fig. 3. Performance predictions at partial states

The consolidated view of predictive insights is visualized in Fig. 3. Here, the predicted performance, in terms of total process execution time, is depicted for each observed effect at the completion of a task. We started with questions of type Q01 and Q02 to generate predictions of process performance time (in minutes) for each of the contextual factors DataIssues + AgentExplow, DataIssues + Highseverity, RemoteResolution + CustomerNew, RemoteResolution + AlertsComplete, SoftwareUpgrade, PasswordReset + AgentExplow, PasswordReset + Severity High at specific semantic traces in the execution of process instances. This consolidated representation, generated using the Watson Analytics Engine, helps in predicting performance at different partial states of an instance execution. It demonstrates the impact of context on the execution of otherwise similar process instances. Using the prediction model represented in Fig. 3, we can similarly predict performance at multiple states of process execution. This can eventually lead the organization to evaluate its resource deployment strategies, such as shifting to a different process design variant.
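
To make the shape of such predictions concrete, the sketch below computes a simple empirical baseline: the mean total process execution time for each pair of contextual factor and observed effect. This is a stand-in for illustration only and is not the model produced by the Watson Analytics Engine. The function name, the effect labels and all durations are hypothetical, and the rows are a simplified projection of TD records.

```python
from collections import defaultdict
from statistics import mean

def mean_time_by_context_and_effect(rows):
    """Average total process time (minutes) per (context, observed effect)."""
    groups = defaultdict(list)
    for context, effect, total_minutes in rows:
        groups[(context, effect)].append(total_minutes)
    return {key: mean(values) for key, values in groups.items()}

# Hypothetical rows: (context, effect observed at task completion,
# total process execution time in minutes). Values are made up.
rows = [
    ("PasswordReset + AgentExplow", "ticket_created", 40),
    ("PasswordReset + AgentExplow", "ticket_created", 50),
    ("SoftwareUpgrade", "ticket_created", 120),
    ("SoftwareUpgrade", "resolved", 100),
]
table = mean_time_by_context_and_effect(rows)

# Baseline estimate for a running instance in a given partial state.
estimate = table[("PasswordReset + AgentExplow", "ticket_created")]
print(estimate)
```

Looking up the estimate for the same effect under different contexts shows how context alone can shift the expected performance of otherwise similar instances.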

5 Conclusion

Organizations increasingly analyze performance drifts in the day-to-day execution of customer- and context-sensitive business processes. In our proposed approach, we leverage goal-correlated process variations and contextual factors mined from process logs, together with goal-correlated state transitions mined from effect logs. In future work, we will focus on correlating dynamic run-time variations in contextual factors with shifts in goal alignment.