1 Introduction

With recent advancements in IoT technologies, a new ecosystem of Internet-capable physical objects that sense, compute and communicate without human intervention is being formed. ThingStore [1] is a collaborative platform that brings several actors of this ecosystem together by providing service interface definitions, APIs and query processing servers. This study presents Event Query Language (EQL) and a query processing service that resides within ThingStore for real-time event queries in an IoT environment.

Service definitions and the event query language contribute to the field of IoT as part of the standardization efforts in service hosting and application development. The previous study on ThingStore [1] gives a shorter definition of EQL, but focuses on the complete system design. In contrast, the current study not only details smart services, the event query language and the query processing environment, but also revises them. Studies on temporal and logical operators also exist in CEP research [3, 4]; however, they do not support the heterogeneous nature of the IoT environment. The contributions of this study are as follows:

  • An event model and a set of event processing operators are introduced to the field of Complex Event Query Processing. Unlike the previous study on ThingStore, the new operator and smart service definitions are not constrained by any system-specific parameters such as event sampling intervals.

  • A new event query processor is designed to provide immediate response to event occurrences. Unlike the previous study, events are modelled in continuous-time rather than discrete-time, which relies on sampling events.

2 Event Query Language

For the convenience of application developers, EQL is designed to be an intuitive, SQL-like language. An example of a query is given below:

figure a

In this study, we present a new set of complex event definitions for the WHERE section of the language. The other sections are similar to those in the previous work [1].

Definition 1

A smart service is a software program that performs a computation over a data stream to produce and deliver a useful approximation to a boolean function of continuous time, where the function describes some phenomenon in the physical domain.

The boolean function \(f(t): \mathbb {R} \rightarrow \{0,1\}\) can be defined as:

$$ f(t)= {\left\{ \begin{array}{ll} 1, &{} \text {the physical phenomenon exists at time } t \\ 0, &{} \text {otherwise} \end{array}\right. } $$

Using the function above, a smart service can transfer information to the query processing framework in real time by delivering a boolean value and a timestamp at each state change. The service sends a \(\{1, t_0\}\) signal at some time \(t_0\) such that \(f(t_0)=1 ~\wedge ~ \lim _{\delta t \rightarrow 0}f(t_0-\delta t)=0 \). Similarly, it sends a \(\{0, t_1\}\) signal at some time \(t_1\) such that \(f(t_1)=0 ~\wedge ~ \lim _{\delta t \rightarrow 0}f(t_1-\delta t)=1 \). This ideal method assumes the existence of a global timestamp shared by all smart services.
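The transport method can be illustrated with a minimal Python sketch. It is not the ThingStore implementation: the function name `state_changes` and the sampled input are hypothetical, and the sampled sequence only stands in for the physical function \(f\), which the paper models in continuous time. The sketch emits a (value, timestamp) signal only when the boolean state differs from the previous one.

```python
from typing import Iterable, Iterator, Tuple

Signal = Tuple[int, float]  # (boolean state value, timestamp)


def state_changes(samples: Iterable[Tuple[float, int]],
                  initial: int = 0) -> Iterator[Signal]:
    """Yield a (value, timestamp) signal only when the state changes.

    `samples` is a hypothetical sequence of (timestamp, value) observations;
    the state before the first sample is assumed to be `initial`.
    """
    previous = initial
    for timestamp, value in samples:
        if value != previous:
            yield (value, timestamp)
            previous = value


# Six observations collapse into three state-change signals.
samples = [(0.0, 0), (1.0, 1), (2.0, 1), (3.0, 0), (4.0, 0), (5.0, 1)]
signals = list(state_changes(samples))
```

Only the rising edge at 1.0, the falling edge at 3.0 and the rising edge at 5.0 are delivered, so the amount of traffic depends on how often the state changes rather than on how often it is observed.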

Using the data transport method given above, any boolean, continuous-time function of the physical world can be reconstructed by the query processor. Assume that a query processor receives some sequence of tuples \(F = F_1, F_2, F_3, \ldots \), where \(F_n \! = \! (v_n, t_n) ~ \forall n\in \mathbb {N}\) and \(v_n\) represents the boolean state value at timestamp \(t_n\). The reconstruction \(f'\) of the function \(f\) is defined as:

$$ f'(t)= {\left\{ \begin{array}{ll} 1, &{} \exists ~ n: (t_n \le t<t_{n+1})\wedge (v_n=1) \\ 0, &{} \text {otherwise} \end{array}\right. } $$

Although this function and the transport method outlined above represent the complete functionality desired from a smart service, smart services can only deliver an approximation of these features, owing to the limitations of available sensing and software technologies and the lack of a global timestamp. As an approximation, the system takes the arrival time of the information as the occurrence time and does not retrieve a timestamp from the sensing environment.
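As an illustration, the reconstruction of the boolean function from received tuples might be sketched as follows. The helper name `reconstruct` is hypothetical. Two assumptions go beyond the text: the new value is taken to hold at the change instant itself (consistent with \(f(t_0)=1\) at a rising edge), and the most recent signal is assumed to hold until another one arrives.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def reconstruct(signals: List[Tuple[int, float]]) -> Callable[[float], int]:
    """Build a step function from (value, timestamp) tuples sorted by time."""
    times = [t for _, t in signals]
    values = [v for v, _ in signals]

    def f_prime(t: float) -> int:
        # Index of the last signal whose timestamp is <= t.
        i = bisect_right(times, t) - 1
        # Before the first signal the state is assumed to be 0.
        return values[i] if i >= 0 else 0

    return f_prime


f_prime = reconstruct([(1, 1.0), (0, 3.0), (1, 5.0)])
state_before = f_prime(0.5)   # no signal received yet
state_inside = f_prime(2.0)   # between the rising and falling edge
state_after = f_prime(3.0)    # at the falling edge
```

A binary search over the timestamps keeps each lookup logarithmic in the number of received signals.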

Definition 2

Given f(.) as the re-constructed continuous-time function of smart service A, an event instance of A is a time interval defined as a tuple \((t_{start}, t_{end})\) such that:

$$\begin{aligned}\begin{gathered} t_{start} \le t_{end} \\ \forall t_0: t_{start} \le t_0<t_{end} \implies f(t_0)=1 \\ \lim _{\delta t \rightarrow 0}f(t_{start}-\delta t)=0 \\ f(t_{end}) =0 \end{gathered}\end{aligned}$$

In other words, any continuous time interval between two consecutive state changes during which the function f(.) remains true is considered to be an event instance.
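Definition 2 can be made concrete with a short sketch that pairs each rising edge with the next falling edge. The function name `event_instances` is hypothetical, and the sketch assumes the signals arrive in timestamp order; an instance that has started but not yet ended is simply not reported.

```python
from typing import Iterable, List, Optional, Tuple

Interval = Tuple[float, float]  # (t_start, t_end)


def event_instances(signals: Iterable[Tuple[int, float]]) -> List[Interval]:
    """Turn (value, timestamp) state changes into closed event instances."""
    instances: List[Interval] = []
    start: Optional[float] = None
    for value, timestamp in signals:
        if value == 1 and start is None:
            start = timestamp                      # rising edge opens an instance
        elif value == 0 and start is not None:
            instances.append((start, timestamp))   # falling edge closes it
            start = None
    return instances


closed = event_instances([(1, 1.0), (0, 3.0), (1, 5.0), (0, 6.5)])
still_open = event_instances([(1, 1.0), (0, 3.0), (1, 5.0)])
```

Because a smart service has a single boolean state, the instances produced this way never overlap.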

Definition 3

A complex event is a set of time intervals, where a time interval is defined as a tuple of timestamps \((t_{start}, t_{end})\) such that \(t_{start} \le t_{end}\).

Every smart service can be expressed as the set of all event instances associated with its continuous-time function. Thus, smart services may be regarded as complex events. A smart service A can be represented as \( A = \{a_1, a_2, a_3, \ldots \}\), where \(a_n=(a_{sn}, a_{en})\).

Definition 4

A smart service operator is a function that takes one or more smart services as input, and outputs another smart service.

Definition 5

A complex event operator is a function that takes one or multiple complex events as input, and generates another complex event according to a pre-defined pattern.

It is important that operator outputs have the same type as their inputs so that they can be consumed by other operators in a nested manner. The difference between a smart service and a complex event is that event instances of a smart service cannot overlap, whereas a complex event may have overlapping event instances. Consequently, smart services can be processed by both smart service operators and complex event operators, but the output of a complex event operator can only be processed by other complex event operators.

The proposed query language supports three smart service operators, which are derived from boolean algebra: AND, OR and NOT. Assuming that \(f_A\), \(f_B\) and \(f_C\) are the continuous-time boolean functions \(\mathbb {R} \rightarrow \{0,1\}\) of smart services A, B and C, the smart service operators are defined as:

$$\begin{aligned} C = \varvec{AND}(A,B) \iff&\forall t: f_C(t)=f_A(t) \wedge f_B(t) \\ C = \varvec{OR}(A,B) \iff&\forall t: f_C(t)=f_A(t) \vee f_B(t) \\ C = \varvec{NOT}(A) \iff&\forall t: f_C(t)=1-f_A(t) \end{aligned}$$
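The boolean operators above can be evaluated in an event-driven way, recomputing the output only when an input changes. The following sketch is an illustration under assumptions, not the paper's implementation: the class `BooleanOperator` and its method `on_change` are hypothetical names, and every input is assumed to start in state 0.

```python
from typing import Callable, List, Tuple

Signal = Tuple[int, float]  # (boolean state value, timestamp)


class BooleanOperator:
    """Combine boolean inputs and emit a signal only when the output changes."""

    def __init__(self, fn: Callable[..., int], arity: int) -> None:
        self.fn = fn
        self.inputs = [0] * arity            # assumed initial state of every input
        self.output = fn(*self.inputs)
        self.emitted: List[Signal] = []

    def on_change(self, index: int, value: int, timestamp: float) -> None:
        self.inputs[index] = value
        new_output = self.fn(*self.inputs)
        if new_output != self.output:
            self.output = new_output
            self.emitted.append((new_output, timestamp))


and_op = BooleanOperator(lambda a, b: a & b, arity=2)
or_op = BooleanOperator(lambda a, b: a | b, arity=2)
not_op = BooleanOperator(lambda a: 1 - a, arity=1)

# A is true on [1, 4) and B is true on [2, 6); changes arrive in time order.
for index, value, t in [(0, 1, 1.0), (1, 1, 2.0), (0, 0, 4.0), (1, 0, 6.0)]:
    and_op.on_change(index, value, t)
    or_op.on_change(index, value, t)
    if index == 0:
        not_op.on_change(0, value, t)
```

Here `and_op` emits a true interval of [2, 4), `or_op` emits [1, 6), and `not_op` falls at 1.0 and rises again at 4.0, matching the pointwise definitions.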

The proposed query language supports seven complex event operators: BEFORE, MEETS, OVERLAPS, STARTS, DURING, FINISHES, EQUAL. These operators are derived from Allen’s temporal event logic [2]. Given two complex events, \(A=\{a_1,a_2, a_3...\}\) and \(B=\{b_1,b_2,b_3...\}\) where each element of A and B indicates time interval tuples \(a_n=(a_{sn},a_{en})\) and \(b_k=(b_{sk},b_{ek})\), the temporal operators are defined as below:

$$\begin{aligned} ( a_{sn}, b_{ek} ) ~\in \varvec{BEFORE}(A,B) \iff&a_{en}<b_{sk} \\ ( a_{sn}, b_{ek} ) ~\in \varvec{MEETS}(A,B,\sigma ) \iff&|a_{en}-b_{sk}|<\sigma \\ ( a_{sn}, b_{ek} ) ~\in \varvec{OVERLAPS}(A,B) \iff&(a_{sn}< b_{sk}< a_{en}< b_{ek})\\ ( a_{sn}, b_{ek} ) ~\in \varvec{STARTS}(A,B,\sigma ) \iff&(|a_{sn}-b_{sk}|<\sigma ) \wedge (a_{en}<b_{ek})\\ ( a_{sn}, b_{ek} ) ~\in \varvec{DURING}(A,B) \iff&(a_{sn}<b_{sk}) \wedge (a_{en}>b_{ek}) \\ ( a_{sn}, b_{ek} ) ~\in \varvec{FINISHES}(A,B,\sigma ) \iff&(a_{sn}<b_{sk}) \wedge (|a_{en} -b_{ek}|<\sigma )\\ ( a_{sn}, b_{ek} ) ~\in \varvec{EQUAL}(A,B,\sigma ) \iff&(|a_{sn}-b_{sk}|<\sigma ) \wedge (|a_{en} -b_{ek}|<\sigma ) \end{aligned}$$

From the definitions above, as the constant \(\sigma \) goes to zero, the operators MEETS, STARTS, FINISHES and EQUAL constrain start or end points to be simultaneous. Detecting simultaneity at millisecond resolution normally requires a hard real-time environment. The proposed language instead allows one to define a time vicinity within which two points are treated as simultaneous, using the parameter \(\sigma \).
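For illustration, the seven operator definitions can be transcribed directly into predicates over interval pairs. This is a batch sketch over finite lists, not the streaming processor described in the paper; the function names and the helper `apply_operator` are hypothetical. Each match yields the interval \((a_{sn}, b_{ek})\), as in the definitions.

```python
from typing import Callable, List, Tuple

Interval = Tuple[float, float]  # (start, end)
Relation = Callable[[Interval, Interval, float], bool]


def before(a: Interval, b: Interval, sigma: float = 0.0) -> bool:
    return a[1] < b[0]

def meets(a: Interval, b: Interval, sigma: float) -> bool:
    return abs(a[1] - b[0]) < sigma

def overlaps(a: Interval, b: Interval, sigma: float = 0.0) -> bool:
    return a[0] < b[0] < a[1] < b[1]

def starts(a: Interval, b: Interval, sigma: float) -> bool:
    return abs(a[0] - b[0]) < sigma and a[1] < b[1]

def during(a: Interval, b: Interval, sigma: float = 0.0) -> bool:
    return a[0] < b[0] and a[1] > b[1]

def finishes(a: Interval, b: Interval, sigma: float) -> bool:
    return a[0] < b[0] and abs(a[1] - b[1]) < sigma

def equal(a: Interval, b: Interval, sigma: float) -> bool:
    return abs(a[0] - b[0]) < sigma and abs(a[1] - b[1]) < sigma


def apply_operator(relation: Relation, A: List[Interval], B: List[Interval],
                   sigma: float = 0.0) -> List[Interval]:
    """Return (a_start, b_end) for every pair (a, b) satisfying the relation."""
    return [(a[0], b[1]) for a in A for b in B if relation(a, b, sigma)]


A = [(0.0, 2.0), (10.0, 12.0)]
B = [(2.05, 5.0), (11.0, 11.5)]
met_loose = apply_operator(meets, A, B, sigma=0.1)    # 2.0 and 2.05 are close enough
met_tight = apply_operator(meets, A, B, sigma=0.01)   # too strict for a 0.05 gap
contained = apply_operator(during, A, B)
```

The two MEETS calls show the role of \(\sigma \): a gap of 0.05 between the end of one interval and the start of the next counts as meeting when \(\sigma = 0.1\) but not when \(\sigma = 0.01\). Because the comparisons are strict, \(\sigma = 0\) matches nothing.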

Fig. 1. An example of an execution tree for an event query.

3 Event Query Processing

A query that is received by the query processor is parsed into an execution tree whose nodes are smart services, operators and event handlers. An example of an execution tree is given in Fig. 1. Smart services are the leaves of the execution tree. They propagate event state changes (OnEventAStart, OnEventAEnd, OnEventBStart, OnEventBEnd) to the nodes above them, which are smart service operators and complex event operators. Smart services, smart service operators and complex event operators can all be children of a complex event operator, but only smart services and smart service operators can be children of a smart service operator.

Operators run under soft real-time requirements. Each event state change is handled by a procedure call that is defined within the node. A sample procedure for handling events in a complex event operator node is given in Algorithm 1.

figure b
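To make the event-handler idea concrete, the following is a hypothetical sketch of a complex event operator node for BEFORE. It is not a reproduction of Algorithm 1: the class name `BeforeNode`, its handler methods and the bookkeeping strategy are assumptions chosen for illustration. Both children are assumed to be smart services, so their event instances do not overlap, and handlers are assumed to be called in time order.

```python
from typing import Callable, List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end)


class BeforeNode:
    """Emit (a_start, b_end) when an A instance ended before a B instance began."""

    def __init__(self, on_match: Callable[[Interval], None]) -> None:
        self.on_match = on_match              # handler of the parent node or client
        self.a_start: Optional[float] = None
        self.finished_a: List[Interval] = []  # completed A instances
        self.pending: Optional[List[Interval]] = None

    def on_a_start(self, t: float) -> None:
        self.a_start = t

    def on_a_end(self, t: float) -> None:
        if self.a_start is not None:
            self.finished_a.append((self.a_start, t))
            self.a_start = None

    def on_b_start(self, t: float) -> None:
        # Remember the A instances that ended strictly before this B started.
        self.pending = [a for a in self.finished_a if a[1] < t]

    def on_b_end(self, t: float) -> None:
        if self.pending is not None:
            for a_start, _ in self.pending:
                self.on_match((a_start, t))
            self.pending = None


matches: List[Interval] = []
node = BeforeNode(matches.append)
node.on_a_start(1.0)
node.on_a_end(2.0)
node.on_b_start(3.0)
node.on_b_end(5.0)
```

The node does no work between state changes, which reflects the event-driven design described in this section. A production node would also need to bound the list of completed instances, which this sketch leaves unbounded.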

4 Experimental Study

For the experimental study, the same setup as in the previous study of ThingStore [1] is built. The observed differences demonstrate the effectiveness of the new system. First, while the previous system performs a computation on every discrete time step of the ThingStore query processor, the new EQL processor computes only when a smart service changes state. Since the average state change frequency in an environment is significantly smaller than the discrete sampling frequency, the proposed system performs fewer operations. In our test, each new service changes its state once, so the throughput is 1 for each new query subscription added to the system. To provide the same functionality, our previous system calculates and delivers output on every tick of the internal sampling clock, which amounts to more than 40,000 computations per second when 1000 queries are present in the system. A comparison with the previous experiment is shown in Fig. 2a. The ThingStore data are reused from [1] and show the median of 10 experiments.

The new system also responds quickly to event instances. Figure 2b shows the end-to-end delay between the smart service and the client under different loads. As in the previous paper, the median of 10 simulations is used for ThingStore. The median delay over 10 simulations for the new system is a constant 1 ms, so the delay values of a single simulation are shown directly in the graph. As also indicated in the previous study, when a small number of queries are subscribed, ThingStore's delay is at least a random value between roughly 1 ms and 15 ms, which corresponds to the sampling interval. In contrast, the new system shows a robust delay of around 1 ms or 2 ms for almost every event instance in this setup. It can also be observed that ThingStore's delay increases as the number of subscriptions grows and computations are delayed.

Fig. 2. Performance analysis of event query execution. Throughput (a) and delay (b) of the system under different system loads.