1 Introduction

Sensors are now widely deployed in industrial environments to monitor device status in real time. A sensor continuously generates events, and events from different sensors are often correlated with each other. Modeling such event correlations enables application-level sensor collaboration. In our previous work, we designed an event log based on a stream data processing infrastructure [1, 2]. However, the correlations can be dynamically interwoven, so data-driven analysis is needed to uncover them.

The event correlation discovery problem concerns how to identify relationships among sensor events. Similar problems have received notable attention in the discovery, monitoring, and analysis of processes; in those studies, the relationships among events are semantic relationships [3,4,5,6,7].

In our previous work [1, 2], we studied a new kind of relationship among sensor events, called statistical correlation, and used the Pearson correlation coefficient to measure it. Specifically, we mapped physical sensors into a software-defined abstraction called the proactive data service. A proactive data service takes event streams derived from physical sensors or other services as inputs and transforms them into new streams based on user-defined operations. In [2], we also proposed another abstraction, called the service hyperlink, to encapsulate correlations between the streams received and outputted by two data services. With service hyperlinks, a service can dynamically route an event to other services at runtime. In this way, the knowledge of how sensors collaborate with each other can be captured at the software layer.

In this paper, we further refine event correlation to capture when and how one type of event causes another. Such an event correlation can be readily transformed into a relationship between two IoT services. The main contributions are: (1) We propose an algorithm, called CorFinder, to discover such event correlations in a log of sensor events; to this end, we extend a classic frequent sequence mining algorithm. (2) We apply our approach to make anomaly warnings in a real power plant based on the discovered event correlations, and we elaborate on how our approach works and how it differs from traditional approaches. (3) We conduct extensive experiments on a dataset from a power plant to show the effectiveness of our approach.

2 Problem Analysis

Figure 1 shows a real case of anomaly detection in a power plant. Fan stall is a major failure of an important piece of equipment, the primary air fan (PAF), and causes severe damage to the whole air and flue system. Currently, detecting such equipment failures in a power plant mainly depends on observing and judging envelope ranges: operators detect anomalies through various phenomena, such as a sharp descent of the exit air pressure, electricity, and air volume of a PAF. However, when such phenomena are observed, the anomaly has already occurred and the loss is inevitable.

Fig. 1. Partial possible cases of fan stall in the primary air fan: a real case.

From a systemic view, a severe failure is often caused step by step by trivial anomalies. The paths of anomaly propagation are usually hidden behind the correlations of sensor events in an IoT system. Figure 1 shows several possible event propagation paths leading to the fan stall failure. We can observe that each propagation path is formed of several correlated sensors.

For example, a decrease of valve degree (Valve Degree Descending Event) reduces the inlet air header pressure (Inlet Air Header Pressure Descending Event). To maintain the output of the boiler, the valve degree is automatically increased (Valve Degree Ascending Event) to keep the inlet air header pressure from falling. Following its rise, the air pressure increases (Air Pressure Ascending Event), which leads to the growth of electricity and exit air pressure. Unfortunately, excess air pressure can cause a fan stall, which manifests as a sharp drop of electricity (Electricity Descending Event) and exit air pressure (Exit Air Pressure Descending Event).

However, we find that such correlations are not always present. For example, consider the exit air pressure sensor and the inlet air header pressure sensor: their correlation only exists when the value of the exit air pressure sensor exceeds 5. In this situation, the value of the inlet air header pressure sensor usually follows that of the exit air pressure sensor after about 3 min. Many similar cases can be found.

The above case shows that, to issue warnings in advance, we need to clearly understand how an event transforms itself and propagates among different devices. An effective way is to mine the event correlations. If we find such correlations, we can merge them to form an event propagation path, as Fig. 1 shows.

3 Definitions

A sensor event e consists of four elements: a generation timestamp, a unique identifier, a sensor id and a value. A sensor event log records events from all sensors in an IoT system. We formulate a sensor event log as follows.

Definition 1

(Sensor Event Log): Given a set of sensors \( S = \{ s_1, s_2, \ldots, s_m \} \), a sensor event log is a set of sensor events \( L = \{ e_1, e_2, \ldots, e_n \} \), where each \( e_i \ (i = 1, \ldots, n) \) is a sensor event generated by a sensor \( s_j \in S \).

For example, a sample of sensor event log is L = {

2015-11-15 02:24:20, 118967, A110(Valve Degree), 0.359347557;

2015-11-15 02:24:20, 118968, A763(Coal Consumption), 36.54394756;

2015-11-15 02:24:20, 118969, A945(Electricity), 123.4148096;

2015-11-15 02:24:20, 118970, A658(Vibration), 97.32905983;

2015-11-15 02:24:21, 118967, A110(Valve Degree), 0.359347557;

}.

From a sensor event log L, an event sequence is the sequence of events from the same sensor, in ascending order of their timestamps. The correlation between event sequences is defined as follows.
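To make Definition 1 concrete, here is a minimal sketch (illustrative Java; class and field names are ours, not the paper's implementation) that groups a small log like the one above into per-sensor event sequences:

```java
// Group a sensor event log into per-sensor event sequences (Definition 1).
import java.util.*;
import java.util.stream.*;

public class EventLog {
    static class SensorEvent {
        final String timestamp; final long id; final String sensorId; final double value;
        SensorEvent(String ts, long id, String sid, double v) {
            this.timestamp = ts; this.id = id; this.sensorId = sid; this.value = v;
        }
    }

    /** Sort by timestamp ("YYYY-MM-DD HH:MM:SS" sorts lexicographically),
     *  then group by sensor id; each group is one event sequence. */
    static Map<String, List<SensorEvent>> toSequences(List<SensorEvent> log) {
        return log.stream()
                  .sorted(Comparator.comparing(e -> e.timestamp))
                  .collect(Collectors.groupingBy(e -> e.sensorId));
    }

    public static void main(String[] args) {
        List<SensorEvent> log = Arrays.asList(
            new SensorEvent("2015-11-15 02:24:20", 118967, "A110", 0.359347557),
            new SensorEvent("2015-11-15 02:24:20", 118968, "A763", 36.54394756),
            new SensorEvent("2015-11-15 02:24:21", 118969, "A110", 0.359347557));
        toSequences(log).forEach((s, seq) -> System.out.println(s + ": " + seq.size()));
    }
}
```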

Definition 2

(Event Correlation): Given two event sequences \( q_s \) and \( q_t \), let \( c = (q_s, q_t, \Delta t, conf) \) be the event correlation between \( q_s \) and \( q_t \), where \( q_s \) is the source, \( q_t \) is the target, \( \Delta t \) is the time delay from \( q_s \) to \( q_t \), and \( conf \) is a measure of the strength of the relationship between \( q_s \) and \( q_t \).

The left part of Fig. 2 gives an example of an event correlation: the red dashed lines mark out \( q_s \) and \( q_t \), respectively, and \( \Delta t \) is 4 s.

Fig. 2. An example of event correlation.

4 Discovery of Event Correlation

4.1 The Rationales

The main idea is to transform event correlation discovery into a frequent sequence mining problem. To do this, as the right part of Fig. 2 shows, the numerical event sequence from a sensor is first transformed into a symbol sequence [8]. Essentially, symbolization is a coarse-grained description, since each symbol corresponds to a segment of the original sequence. Hence, if a sequence correlates with another, there probably exists a frequent sequence between their symbolized sequences [8]. This inspires us to use frequent sequences to measure event correlation. In other words, if two symbolized sequences \( s_i \) and \( s_j \) have a long enough frequent sequence, there is a correlation between them.

One challenge is how to identify the time delay between two correlated event sequences, as shown in Fig. 2. This delay reflects how long it takes a sensor to be affected by the value changes of its correlated sensor. Traditional frequent sequence mining algorithms cannot solve this problem directly, because they only focus on the occurrence frequency of a sequence in a sequence set [9, 10]. Hence, we design an algorithm that discovers frequent sequences each of whose elements occurs across the sequence set within a short time period, i.e., the time delay \( \Delta t \) in Definition 2. Another challenge is how to determine the target and the source from a frequent sequence. If each element of a frequent sequence occurs in the same order in the sequence set, the frequent sequence can identify the target and the source. Taking the right picture in Fig. 2 as an example, each element occurs a little earlier (no more than 4 s) in the valve degree sequence than in the coal consumption sequence. This indicates that the valve degree sequence is the source and the coal consumption sequence is the target. In a word, if two symbol sequences \( s_i \) and \( s_j \) have a long-enough frequent sequence, each element of which occurs in \( s_i \) and \( s_j \) in the same order within the time period \( \Delta t \), then their original event sequences \( q_i \) and \( q_j \) have an event correlation \( (q_i, q_j, \Delta t, conf) \). Here \( conf \) can be computed as the ratio of the length of the frequent sequence to the length of the symbolized sequence.

Based on the above discussion, we propose an algorithm called CorFinder to discover event correlations. First, it uses a classic algorithm called SAX [8] to symbolize each event sequence in a sensor event log. Second, it mines the frequent sequences described above. Notably, we take the gap constraint [9] into consideration: a gap constraint γ means that any two adjacent elements in a frequent sequence skip no more than γ consecutive elements in any sequence containing the frequent sequence. The gap constraint excludes uncorrelated segments from correlated sequences.
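To illustrate how the time delay and the gap constraint interact in the pairwise case, the following sketch (our own reading, not the paper's CorFinder code) computes the longest common subsequence of two symbolized sequences in which every matched pair shares a symbol and occurs within \( \Delta t \) (source first), and consecutive matches skip no more than γ elements in either sequence; \( conf \) is then the matched length divided by the sequence length:

```java
// Longest common subsequence under a time-delay window and a gap constraint.
public class CorrelationConf {
    /** srcSym/srcT and tgtSym/tgtT: symbols and timestamps (seconds) per element. */
    static int matchedLength(char[] srcSym, long[] srcT, char[] tgtSym, long[] tgtT,
                             long dt, int gamma) {
        int n = srcSym.length, m = tgtSym.length, longest = 0;
        int[][] best = new int[n][m]; // best[i][j]: longest chain ending at match (i, j)
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                long delay = tgtT[j] - srcT[i];
                if (srcSym[i] != tgtSym[j] || delay < 0 || delay > dt) continue;
                int ext = 0; // best previous match that skips at most gamma elements
                for (int p = Math.max(0, i - gamma - 1); p < i; p++)
                    for (int q = Math.max(0, j - gamma - 1); q < j; q++)
                        ext = Math.max(ext, best[p][q]);
                best[i][j] = ext + 1;
                longest = Math.max(longest, best[i][j]);
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        // toy data: valve degree (source) leads coal consumption (target) by <= 4 s
        int len = matchedLength(new char[]{'c','h','c'}, new long[]{0, 5, 10},
                                new char[]{'c','h','c'}, new long[]{3, 8, 13}, 4, 1);
        System.out.println(len + " matched; conf = " + (double) len / 3); // 3; 1.0
    }
}
```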

Symbolization.

In this paper, the classic symbolic representation algorithm, Symbolic Aggregate approXimation (SAX) [8], is used to preprocess the input numerical event sequences. SAX reduces an event sequence of length n to a symbol sequence of length m \( (m \ll n) \) composed of k different symbols; we attach a timestamp to each symbol. The sequences in Table 1 are the symbolizations of four event sequences from a sensor event log in a power plant via SAX with \( k = 15 \); the first two event sequences are shown in Fig. 2. A compact sketch of SAX is given after Table 1.

Table 1. A sample of a symbolized event sequence set (running example).
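A compact sketch of the SAX idea (simplified from [8]; the breakpoints below are hard-coded for k = 4 symbols, while the paper uses k = 15, whose breakpoints are the corresponding standard-normal quantiles):

```java
// SAX: z-normalize, reduce to m frames via piecewise aggregate approximation
// (PAA), then map each frame mean to a symbol via N(0,1) breakpoints.
public class Sax {
    static String sax(double[] series, int m) {
        int n = series.length;
        double mean = 0, var = 0;
        for (double x : series) mean += x / n;
        for (double x : series) var += (x - mean) * (x - mean) / n;
        double sd = var > 0 ? Math.sqrt(var) : 1.0;
        double[] cuts = {-0.6745, 0.0, 0.6745}; // N(0,1) quartiles for k = 4
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < m; i++) {
            int lo = n * i / m, hi = n * (i + 1) / m; // PAA frame [lo, hi)
            double avg = 0;
            for (int t = lo; t < hi; t++) avg += (series[t] - mean) / sd / (hi - lo);
            char sym = 'a';
            for (double c : cuts) if (avg > c) sym++; // count breakpoints below avg
            out.append(sym); // in the paper's setting a timestamp is kept per symbol
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sax(new double[]{0.1, 0.2, 0.9, 1.1, 0.4, 0.3, 1.8, 2.0}, 4));
    }
}
```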

Frequent Sequence Mining.

Before introducing our algorithm, we list some related concepts of frequent sequence mining. A sequence in a sequence set D is associated with an identifier, called a SID. The support of a sequence is the number of sequences in D that contain it. A sequence is frequent if its support exceeds a pre-specified minimum support threshold in D. A frequent sequence of length l is called an l-frequent sequence; it is closed if D contains no super-sequence of it with the same support. The projection database of a sequence s in D is defined as \( \{ \theta \mid \eta = \beta \cdot \theta, \eta \in D \} \), where β is the minimal prefix of η containing s.

Projection-based algorithms are a classic category of frequent sequence mining algorithms [10]. They adopt a divide-and-conquer strategy, discovering frequent sequences by building projection databases. These algorithms first generate the 1-frequent sequences \( F_1 = \{ s_1{:}sup_1, s_2{:}sup_2, \ldots, s_n{:}sup_n \} \), where \( s_i \) is a 1-frequent sequence and \( sup_i \) is its support. This step is followed by the construction of a projection database for each 1-frequent sequence. In each projection database, they generate the 1-frequent sequences \( F_2 \) and the projection database of each element of \( F_2 \). The process is repeated until no 1-frequent sequence remains. We propose two data structures, described below, to extend the classic algorithms.
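For single-symbol sequences, the projection strategy can be sketched as the following minimal PrefixSpan-style recursion (illustrative, not the paper's implementation):

```java
// Minimal PrefixSpan-style miner over single-symbol sequences.
import java.util.*;

public class PrefixSpan {
    static void mine(List<String> db, int minSup, String prefix) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (String seq : db)                        // count 1-frequent symbols
            for (char c : new TreeSet<>(toChars(seq)))
                counts.merge(c, 1, Integer::sum);
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            if (e.getValue() < minSup) continue;
            String pattern = prefix + e.getKey();
            System.out.println(pattern + " : " + e.getValue());
            List<String> projected = new ArrayList<>(); // suffixes after first hit
            for (String seq : db) {
                int idx = seq.indexOf(e.getKey());
                if (idx >= 0) projected.add(seq.substring(idx + 1));
            }
            mine(projected, minSup, pattern);        // recurse on the projection
        }
    }

    private static List<Character> toChars(String s) {
        List<Character> cs = new ArrayList<>();
        for (char c : s.toCharArray()) cs.add(c);
        return cs;
    }

    public static void main(String[] args) {
        mine(Arrays.asList("abcb", "abbc", "bca"), 2, "");
    }
}
```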

Loose \( (\gamma, \Delta t, l) \)-Frequent Sequence and \( \gamma \)-Projection Database.

We now introduce several concepts. Traditionally, a frequent sequence of length 1 is called a 1-frequent sequence. In this paper, a 1-frequent sequence whose occurrences fall within the time period \( \Delta t \) is called a \( (\Delta t, 1) \)-frequent sequence. The concept is extended to the loose \( (\Delta t, 1) \)-frequent sequence \( s': \langle (SID_1, t_1), (SID_2, t_2), \ldots, (SID_m, t_m) \rangle \), where \( s' \) occurs in \( SID_i \) at \( t_i \) and \( t_{i+1} - t_i \le \Delta t \). We generalize the loose \( (\Delta t, 1) \)-frequent sequence to length l as follows: given a set of \( (\Delta t, 1) \)-frequent sequences \( s'_1, s'_2, \ldots, s'_l \) for an id-list \( \langle SID_1, SID_2, \ldots, SID_m \rangle \), if \( s'_1, s'_2, \ldots, s'_l \) occur in this order in every \( SID_j \ (j = 1, 2, \ldots, m) \), then \( \langle s'_1, s'_2, \ldots, s'_l \rangle \) is a loose \( (\Delta t, l) \)-frequent sequence for the id-list. A loose \( (\Delta t, l) \)-frequent sequence becomes a loose \( (\gamma, \Delta t, l) \)-frequent sequence if it satisfies the gap constraint γ, i.e., \( s'_i \) and \( s'_{i+1} \ (i = 1, 2, \ldots, l-1) \) skip no more than γ consecutive elements in every \( SID_j \ (j = 1, 2, \ldots, m) \).
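Under our reading of this definition, loose \( (\Delta t, 1) \)-frequent sequences can be built by sorting all occurrences of one symbol by time and chaining neighbours that lie within \( \Delta t \) of each other, as the following sketch shows (an assumed construction, not the paper's code):

```java
// Chain time-sorted occurrences of one symbol into loose (dt,1)-frequent sequences.
import java.util.*;

public class LooseDt1 {
    static class Occ { final String sid; final long t;
        Occ(String sid, long t) { this.sid = sid; this.t = t; } }

    static List<List<Occ>> looseDt1(List<Occ> occurrences, long dt, int minSup) {
        List<Occ> occ = new ArrayList<>(occurrences);
        occ.sort(Comparator.comparingLong(o -> o.t));
        List<List<Occ>> chains = new ArrayList<>();
        List<Occ> current = new ArrayList<>();
        for (Occ o : occ) {
            if (!current.isEmpty() && o.t - current.get(current.size() - 1).t > dt) {
                chains.add(current);             // gap exceeds dt: close the chain
                current = new ArrayList<>();
            }
            current.add(o);
        }
        chains.add(current);
        chains.removeIf(c -> c.size() < minSup); // keep chains meeting the support
        return chains;
    }

    public static void main(String[] args) {
        // "VB" is a hypothetical sensor id used only for this toy example
        List<Occ> occ = Arrays.asList(new Occ("VD", 1), new Occ("CC", 3),
                                      new Occ("E", 20), new Occ("VB", 23));
        looseDt1(occ, 5, 2).forEach(c -> System.out.println(c.size() + "-chain"));
    }
}
```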

According to the previous analysis, loose \( (\gamma, \Delta t, l) \)-frequent sequences formalize the frequent sequences our algorithm looks for, and they identify our event correlations. To discover loose \( (\gamma, \Delta t, l) \)-frequent sequences, we propose the γ-projection database. The γ-projection database of a sequence s in D is denoted \( \{ \alpha \mid \eta = \beta \cdot \theta, \eta \in D \} \), where β is the minimal prefix of η containing s, and α is the prefix of θ of length γ + 1.

Some examples of the above concepts are shown in Table 1. Let Δt = 5 s and γ = 2. l: 〈(VD, t1), (CC, t2)〉 is a \( (\Delta t, 1) \)-frequent sequence (grey squares in Table 1); c: 〈(E, t14), (VD, t15), (CC, t16)〉 is a loose \( (\Delta t, 1) \)-frequent sequence (blue squares in Table 1), and {(VD, 〈(b, t16), (b, t17)〉), (CC, 〈(b, t17), (b, t18)〉), (E, 〈(e, t15), (c, t16)〉)} is its γ-projection database (red squares in Table 1); 〈c, h〉: 〈(VD, 〈t23, t24〉), (CC, 〈t24, t25〉)〉 is a loose (γ, Δt, 2)-frequent sequence (green squares in Table 1); 〈c, i〉: 〈(E, 〈t22, t26〉), (V, 〈t22, t26〉)〉 is a loose \( (\Delta t, 2) \)-frequent sequence but not a loose (γ, Δt, 2)-frequent one (purple squares in Table 1).

4.2 The CorFinder Algorithm

In this paper, we improve the classic projection-based algorithms and propose the CorFinder algorithm to solve our problem. The traditional 1-frequent sequence s:sup does not consider the occurrence times of s; consequently, we propose the \( (\Delta t, 1) \)-frequent sequence. However, adjacent \( (\Delta t, 1) \)-frequent sequences of the same symbol s may overlap, which increases storage cost and leads to repeated counting. For instance, the adjacent \( (\Delta t, 1) \)-frequent sequences of c, c:〈(E, t14), (VD, t15)〉 and c:〈(VD, t15), (CC, t16)〉, overlap in (VD, t15). Therefore, we extend the \( (\Delta t, 1) \)-frequent sequence to the loose \( (\Delta t, 1) \)-frequent sequence. The following Theorem 1 lays the foundation for the completeness of our algorithm.

Theorem 1.

Each \( (\Delta t, 1) \)-frequent sequence in a given sequence set D is contained in a loose \( (\Delta t, 1) \)-frequent sequence in D. Conversely, every element of each loose \( (\Delta t, 1) \)-frequent sequence in D is contained in a \( (\Delta t, 1) \)-frequent sequence.

Proof.

We prove the theorem by contradiction. Let D be a sequence set containing a \( (\Delta t, 1) \)-frequent sequence \( s: \langle (SID_1, t_1), (SID_2, t_2), \ldots, (SID_k, t_k) \rangle \). Assume that no loose \( (\Delta t, 1) \)-frequent sequence contains s. Then for every \( SID_i \in s \ (i < k) \), \( t_{i+1} - t_i > \Delta t \), and hence \( t_k - t_1 > (k-1) \cdot \Delta t \). Therefore s is not a \( (\Delta t, 1) \)-frequent sequence, which contradicts the assumption.

On the other hand, assume that some element \( (SID_i, t_i) \) of a loose \( (\Delta t, 1) \)-frequent sequence \( s': \langle (SID_1, t_1), (SID_2, t_2), \ldots, (SID_m, t_m) \rangle \) is contained in none of the \( (\Delta t, 1) \)-frequent sequences in D. Let \( (SID_j, t_j) \) be the element nearest to \( (SID_i, t_i) \) with \( SID_j \ne SID_i \). Since \( (SID_i, t_i) \) is not contained in any \( (\Delta t, 1) \)-frequent sequence, \( |t_i - t_j| > \Delta t \). This contradicts the assumption that \( s' \) is a loose \( (\Delta t, 1) \)-frequent sequence. Theorem 1 is thus proved.

A loose \( (\Delta t, l) \)-frequent sequence can tell the target from the source in an event correlation while accounting for the time delay \( \Delta t \) between them; it is thus a measure of our event correlation. Our CorFinder algorithm aims at discovering loose \( (\gamma, \Delta t, l) \)-frequent sequences to find event correlations. Theorem 2 allows us to discover a loose \( (\gamma, \Delta t, l) \)-frequent sequence in the γ-projection database of its (l-1)-prefix.

Theorem 2.

Any loose \( (\gamma, \Delta t, l) \)-frequent sequence \( s' = \langle s'_1, s'_2, \ldots, s'_l \rangle \) can be discovered from the id-lists of \( p \) and \( s'_l \), where \( p \) is the prefix of \( s' \) of length \( l-1 \) and \( s'_l \) is a loose \( (\Delta t, 1) \)-frequent sequence in the γ-projection database of \( p \).

Proof.

Obviously, \( p \) is a loose \( (\gamma, \Delta t, l-1) \)-frequent sequence. Let \( D_p \) be the γ-projection database of \( p \). Because \( s' \) is a loose \( (\gamma, \Delta t, l) \)-frequent sequence with id-list \( \langle SID_1, SID_2, \ldots, SID_m \rangle \), \( s'_l \) must be a loose \( (\Delta t, 1) \)-frequent sequence for the same id-list. Therefore, \( s'_l \) is a loose \( (\Delta t, 1) \)-frequent sequence in \( D_p \). Theorem 2 is proved.

Theorem 2 indicates that we can discover the loose \( (\gamma, \Delta t, l) \)-frequent sequences with an (l-1)-prefix \( p \) by the following steps: (1) generate \( D_p \) and all loose \( (\Delta t, 1) \)-frequent sequences in \( D_p \); (2) for each loose \( (\Delta t, 1) \)-frequent sequence \( s'_l \), discover frequent sequences in the id-lists of \( p \) and \( s'_l \); (3) generate the loose \( (\gamma, \Delta t, l) \)-frequent sequences from these frequent sequences.

Consequently, the recursive generation of γ-projection databases and loose \( (\Delta t, 1) \)-frequent sequences discovers all loose \( (\gamma, \Delta t, l) \)-frequent sequences. Finally, the CorFinder algorithm derives event correlations from these loose \( (\gamma, \Delta t, l) \)-frequent sequences.

5 Application of Event Correlation for Anomaly Warning

5.1 The Service Collaboration Framework

Our previous work proposed an IoT service model to encapsulate sensor events into a service [1, 2], which serves as the fundamental unit for building an IoT application. When building a service, a user customizes its functionality by choosing the input sensor events as well as the operations. Each service processes its input sensor events with the predefined operations and generates higher-level events in the form of a stream. A created service can be encapsulated behind a RESTful-like API so that other services or applications can use it conveniently. Moreover, each service has an important component called the service hyperlink, which is responsible for indicating the target services of an outputted event. In this way, our services can run proactively to correlate and collaborate on sensor events to serve IoT applications. Figure 3 presents the framework of our approach.

Fig. 3. The framework of our approach to correlating and collaborating with sensor events.

Different from traditional service models and frameworks with the "request-and-response" model, ours works in a more automatic and real-time way with the "stimuli-and-response" pattern while maintaining the common data service capabilities. The service hyperlink is the key to this goal: it enables higher-level events outputted from one service (the source service) to be routed to another (the target service). After a higher-level event is routed to a target service, the target service is stimulated and autonomously responds to the event.

Our previous work encapsulated correlations among input sensor events as service hyperlinks and used the Pearson correlation coefficient to weigh the correlation degree. However, that measure cannot tell the source from the target between two correlated services. To complete the previous work, in this paper we encapsulate event correlations as service hyperlinks. With these hyperlinks, a service can route an event to the service that holds the target sequence of the encapsulated event correlation.

5.2 The Process to Make Anomaly Warnings in a Power Plant

Service Customization.

Making early warnings in a power plant is a typical use case for our framework. As elaborated at the beginning of the paper, we make early warnings via event propagation paths; e.g., a valve degree ascending event propagates along the path valve degree → coal consumption → electricity → vibration and finally leads to a fan stall in Fig. 1. To reach this goal, we create services that take sensor events from different sensors as inputs. Each service detects and outputs trivial anomaly events, such as a valve degree ascending event. How to define and detect trivial anomaly events precisely is the first problem in this case. It can be solved in two ways: such events can be defined based on business knowledge, or they can be identified by clustering techniques [11]. According to the defined events, we customize the operations in each service so that it can detect these trivial anomaly events autonomously.

For example, we build a valve degree data service as Fig. 4 presents, selecting valve degree sensor events as its inputs. To detect a valve degree ascending event, we customize subtraction as one of its operations; the subtraction operation subtracts the value of the previous sensor event from that of the current one. We ran the K-means algorithm on a real six-month data set from a power plant and concluded that a valve degree difference (diff for short) exceeding 14.97% constitutes a trivial anomaly event. Thus, a filtering operation diff > 14.97% is selected to detect valve degree ascending events. Besides, the inspection staff consider the valve suddenly opening fully to be a trivial anomaly event; according to this business knowledge, we select another filtering operation: diff > 0 ∧ valve degree = 100%. Valve degree and valve degree difference are the key attributes (KPIs) exposed through REST-like APIs. Based on Fig. 2, the hyperlink of this service indicates that the coal consumption service is its target service. A minimal sketch of these operations is given after Fig. 4.

Fig. 4. An example of the valve degree service.
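A minimal sketch of the two filtering operations above (names and event shapes are ours, not the paper's code; valve degree is treated as a fraction in [0, 1], so the 14.97% threshold becomes 0.1497):

```java
// Subtraction operation followed by two filters emitting trivial anomaly events.
import java.util.*;

public class ValveDegreeService {
    static List<String> process(double[] valveDegrees) {
        List<String> anomalies = new ArrayList<>();
        for (int i = 1; i < valveDegrees.length; i++) {
            double diff = valveDegrees[i] - valveDegrees[i - 1]; // subtraction op
            if (diff > 0.1497)                    // K-means threshold: diff > 14.97%
                anomalies.add(i + ": Valve Degree Ascending Event");
            if (diff > 0 && valveDegrees[i] == 1.0) // business rule: suddenly fully open
                anomalies.add(i + ": Valve Suddenly Fully Open Event");
        }
        return anomalies;
    }

    public static void main(String[] args) {
        System.out.println(process(new double[]{0.40, 0.60, 1.00}));
    }
}
```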

Event Propagation.

A service hyperlink encapsulates an event correlation \( (q_s, q_t, \Delta t, conf) \). An outputted event e related to the sensor of \( q_s \) is routed along the hyperlink to the target service, which keeps detecting trivial anomaly events. If it detects an event e' with respect to the sensor of \( q_t \) within the time period \( \Delta t \) after e arrives, the target service records a composite event by appending e to the trivial anomaly event e'. The composite event, instead of e' alone, is then routed along the hyperlink related to e'; a composite event thus records the event propagation path. Figure 5 presents four correlated services; the composite event in the vibration service indicates the event propagation path valve degree ascending event → coal consumption ascending event → electricity ascending event → vibration ascending event. A sketch of this routing step is given after Fig. 5.

Fig. 5. An example of an event propagation path.
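The routing step can be sketched as follows (structures and names assumed): the target service extends the composite event only when its own detection falls within \( \Delta t \) of the arrival:

```java
// Extend a composite event when a local detection falls within dt of arrival.
import java.util.*;

public class EventPropagation {
    /** path: the event names accumulated along the propagation path so far. */
    static List<String> extend(List<String> path, long arrivalTime,
                               long detectionTime, String detectedEvent, long dt) {
        long delay = detectionTime - arrivalTime;
        if (delay < 0 || delay > dt) return null; // detection outside dt: no routing
        List<String> extended = new ArrayList<>(path);
        extended.add(detectedEvent);              // append e' to the composite event
        return extended;
    }

    public static void main(String[] args) {
        List<String> composite = new ArrayList<>(
            Collections.singletonList("Valve Degree Ascending Event"));
        composite = extend(composite, 100, 160, "Coal Consumption Ascending Event", 180);
        System.out.println(composite); // the recorded propagation path
    }
}
```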

Anomaly Warning.

So far, we can obtain the propagation paths of trivial anomaly events in each service. But this is still insufficient for early warnings, since trivial anomaly events are not equal to equipment anomalies. In practice, an inspector performs scheduled maintenance and records equipment anomalies in maintenance records. A maintenance record r = 〈rid, anomaly_desc, rec_time, anomaly_obj〉 consists of a record id, an anomaly description, a recorded time, and an anomaly object. For example, r = 〈118977, vibration increases - fan stall, 2015/10/12 05:12:00, vibration in #2 primary air fan in #3 boiler〉. According to the recorded time and anomaly description in a maintenance record, we can infer causality between event propagation paths and anomalies. For instance, the event propagation path in Fig. 5 often occurred before a fan stall anomaly, so we can infer the causality valve degree ascending event → coal consumption ascending event → electricity ascending event → vibration ascending event ⇒ fan stall. Once such a propagation path occurs at runtime, a warning of a fan stall can be issued. Consequently, each service is initialized with an operation that compares runtime event propagation paths with historical ones; this operation takes composite events as input and outputs warnings to users or other applications. The process of making anomaly warnings in a service after receiving a sensor event is shown in Fig. 6, and a sketch of the comparison operation is given after Fig. 6.

Fig. 6. The process of responding to stimuli autonomously and proactively in a service.
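A sketch of the comparison operation (the prefix-matching rule below is our assumption; the paper does not spell out the exact matching criterion):

```java
// Warn when a historical propagation path begins with the runtime path.
import java.util.*;

public class AnomalyWarning {
    static String check(List<String> runtimePath, Map<List<String>, String> historical) {
        for (Map.Entry<List<String>, String> e : historical.entrySet()) {
            List<String> hist = e.getKey();
            if (hist.size() >= runtimePath.size()
                    && hist.subList(0, runtimePath.size()).equals(runtimePath))
                return "warning: " + e.getValue();
        }
        return null; // no historical path matches: no warning
    }

    public static void main(String[] args) {
        Map<List<String>, String> historical = new HashMap<>();
        historical.put(Arrays.asList("Valve Degree Ascending Event",
                "Coal Consumption Ascending Event", "Electricity Ascending Event",
                "Vibration Ascending Event"), "fan stall");
        System.out.println(check(Arrays.asList("Valve Degree Ascending Event",
                "Coal Consumption Ascending Event"), historical));
    }
}
```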

6 Experiments

6.1 Experiment Setup

Datasets:

The following experiments use a sensor event log from a power plant. The log contains sensor events from 2015-07-26 23:58:30 to 2016-08-17 07:55:00; 480 sensors are involved, and each sensor generates one event per second. The log is divided into two sets. The training set, from 2015-07-26 23:58:30 to 2016-01-31 23:59:55, is used for discovering event correlations. The testing set, from 2016-02-01 00:00:00 to 2016-08-17 07:55:00, is used for making early warnings with our approach; in this set, events from the same source are sent to our services as a stream, and the time interval between two adjacent events matches the real interval at which they were generated. Besides, we use maintenance records of this power plant from 2015-07-26 23:58:30 to 2016-01-31 23:59:55 to verify the accuracy of our approach.

Environments:

The experiments are run on a PC with a quad-core Intel Core i5-2400 CPU at 3.10 GHz and 4.00 GB RAM. The operating system is Windows 7 Ultimate. All the algorithms are implemented in Java with JDK 1.8.0.

6.2 Experiment Results

To verify the effectiveness of our approach, we first create services according to the physical sensors, guided by business knowledge learned from the power plant. Sensors related to one attribute of a device's status are inputted into one service; for example, events from bearing temperature sensors 1, 2, 3, and 4 of the primary air fan are the inputs of the bearing temperature service. We created 108 services from all 440 sensors. Second, we fed the training set into the CorFinder algorithm to discover service hyperlinks. Next, on top of business knowledge and the K-means clustering algorithm, we customized operations in our services to detect trivial anomaly events. After this, we sent the testing set into our services as event streams. Once a service makes an early warning of an anomaly, it prints a message to the console. We compare the warnings with the maintenance records to verify the accuracy of our approach, using the following indicators: precision is the number of correct results divided by the number of all returned results, and recall is the number of correct results divided by the number of results that should have been returned. Notably, our approach makes early warnings of anomalies that occurred in both the training set and the testing set.
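For completeness, the two indicators in code form (a trivial sketch):

```java
// Precision and recall as defined above.
public class Accuracy {
    static double precision(int correct, int returned) {
        return returned == 0 ? 0 : (double) correct / returned;
    }
    static double recall(int correct, int shouldReturn) {
        return shouldReturn == 0 ? 0 : (double) correct / shouldReturn;
    }
    public static void main(String[] args) {
        System.out.println(precision(8, 10) + " " + recall(8, 12)); // 0.8 0.666...
    }
}
```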

To avoid losses, it is better to warn of anomalies before they occur. To this end, we compute the precision and recall of our approach under different lengths of the trivial anomaly event propagation path. In the experiments, we vary the length from 5 to 20; the results are drawn in Figs. 7 and 8.

Fig. 7. The precision of our approach.

Fig. 8. The recall of our approach.

As Fig. 7 shows, the precision of our approach increases with the length of the propagation path, because a longer propagation path specifies an anomaly more precisely. When the path is short, an event has multiple possible propagation paths and may evolve into different anomalies; consequently, the shorter the event propagation path, the lower the precision. Meanwhile, a shorter path needs less time to trigger a warning, so higher precision costs more time. In this experiment, our approach issues warnings before the complete event propagation path has formed, which is the main reason the precision stays below 100%.

On the other hand, as Fig. 8 shows, the recall of our approach decreases as the propagation path grows; unlike the precision, the recall reaches 91.67% when the length is 5. This is because a shorter event propagation path covers more possible anomalies, including those for which warnings should have been made. Besides, analyzing the details of the results, we find that, regardless of the path length, there are several anomalies our approach cannot discover: their propagation paths are not completely covered by the paths in the training set, so the corresponding anomalies cannot be found there. Fortunately, these anomalies occur frequently in the testing set, and their paths are covered by it. This inspires us to solve the problem by updating the training set periodically.

Our experiment results show that we can warn of anomalies before they happen, up to 5 days ahead and 39.8 h ahead on average, with precision and recall exceeding 80%.

7 Related Works

Service correlation has attracted much attention in the field of service computing. Dong et al. tried to capture temporal dependencies based on the numbers of calls to different services [12]. Hashmi et al. proposed a framework for web service negotiation management based on dependency modeling of different QoS parameters among multiple services [13]. Wang et al. considered a dependency to be a relation between services wherein a change to one service implies a potential change to the others [14]; they utilized a service dependency matrix to solve the service replacement problem.

However, most of the existing work only considers input/output dependencies, pre/post-condition dependencies, correlations among services, and so on. None of them considers dependencies in the involved data, which can be regarded as events. Hence, existing studies of event correlation also form the foundation of our work.

Reguieg et al. regarded event correlation as a correlation condition, i.e., a predicate over the attributes of events that can verify which sets of events belong to the same instance of a process [3]. They presented a framework and multi-pass algorithms to discover correlation conditions in process discovery and analysis tasks over big event datasets using MapReduce, guaranteeing efficiency and scalability through partitioning, replication, and optimization of the I/O cost. Motahari-Nezhad et al. focused on event correlations in service-based processes [4]; they proposed the notion of correlation condition mentioned above and developed an algorithm to discover event correlations (semi-)automatically from service interaction logs. Liu et al. presented an event correlation service for distributed middleware-based applications [5], enabling complex event properties and dependencies to be explicitly expressed in correlation rules; remarkably, these correlation rules can be accessed and updated at runtime. These event correlation studies provide foundations for our study, but they do not consider event correlation in an IoT environment.

Recently, some researchers have focused on event dependencies. Song et al. mined activity dependencies (i.e., control dependencies and data dependencies) to discover process instances when event logs cannot meet the completeness criteria [6]; there, the control dependency indicates the execution order, and the data dependency corresponds to the input/output dependency of services. A dependency graph is utilized to mine process instances; however, the authors do not consider dependencies among events. Plantevit et al. presented a new approach to mine temporal dependencies between streams of interval-based events [7]: two events have a temporal dependency if the intervals of one are repeatedly followed by the intervals of the other within a certain time delay.

8 Conclusion

In this paper, we enrich service hyperlinks by encapsulating event correlations in an IoT environment, completing our previous work. We transform service hyperlink discovery into a frequent sequence mining problem and propose the CorFinder algorithm. Moreover, we apply our approach to make anomaly warnings in a power plant. Experiments show that our approach can warn of anomalies before they happen, up to 5 days ahead and 39.8 h ahead on average, while precision and recall exceed 80%.