1 Introduction

Dependability and reliability are two crucial aspects of any computing system that deals with cybersecurity. This is because even a short transient violation of security or privacy policies may leak private or highly sensitive information, compromise safety, or interrupt vital public or social services. One approach to gain confidence about the well-being of such a system is to continuously monitor it with respect to a set of formally specified requirements that the system should meet at all times. This approach is commonly known as runtime verification (RV).

We start with the premise that existing RV techniques cannot monitor a large but vital class of security and privacy policies, e.g., information flow. Take, for instance, the non-interference policy [12], where a low user should not be able to acquire any information about the activities (if any) of the high user by observing independent execution traces. Monitoring this policy would require observing and reasoning about multiple execution traces, whereas existing RV techniques are limited to evaluating only one trace at run time.

In order to specify security and privacy policies, we focus on HyperLTL [8], a temporal logic for expressing hyperproperties [9]. A hyperproperty is a set of sets of execution traces. HyperLTL adds explicit and simultaneous quantification over multiple traces to the standard LTL. HyperLTL significantly extends the range of security policies under consideration, including complex information-flow properties like generalized non-interference, declassification, and quantitative non-interference. For example, the following is a HyperLTL formula:

$$\begin{aligned} \varphi = \forall \pi . \forall \pi '. \;a_{\pi } \, \rightarrow \, \mathbf {F}b_{\pi ^{'}} \end{aligned}$$

It states that for any pair of traces \(\pi \) and \(\pi '\), if proposition a holds in the initial state of \(\pi \), then proposition b should eventually hold in trace \({\pi ^{'}}\). To describe the challenges in monitoring HyperLTL specifications, consider formula \(\varphi \) and two traces \(t = cde\) and \(t' = acddb\). These traces individually (e.g., if \(\pi \) and \(\pi '\) are both instantiated by t) satisfy the formula, but collectively (e.g., if \(\pi \) is instantiated by \(t'\) and \(\pi '\) by t) do not. If a monitor first observes trace t and then \(t'\), it has to somehow remember that b never occurred in t and declare a violation as soon as it observes a in the initial state of \(t'\). Thus, a HyperLTL monitor has to be memoryful; i.e., the monitoring algorithm has to memorize the status of propositions of interest in past traces to be able to reason about current and future traces.
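The example can be replayed with a short sketch. It assumes that a trace is a list of sets of propositions and hard-codes the body of \(\varphi \); the helper names (`to_trace`, `phi`, `forall_forall`) are illustrative and not part of any RV tool.

```python
from itertools import product

def to_trace(word):
    """Turn a string such as 'cde' into a trace of singleton letters."""
    return [{ch} for ch in word]

def phi(pi, pi_prime):
    """a holds initially in pi  ->  b eventually holds in pi_prime."""
    a_initially = len(pi) > 0 and "a" in pi[0]
    b_eventually = any("b" in letter for letter in pi_prime)
    return (not a_initially) or b_eventually

def forall_forall(traces):
    """Check phi for every ordered pair of traces, including a trace with itself."""
    return all(phi(p, q) for p, q in product(traces, repeat=2))

t = to_trace("cde")
t_prime = to_trace("acddb")

print(forall_forall([t]))           # only t has been observed
print(forall_forall([t_prime]))     # only t' has been observed
print(forall_forall([t, t_prime]))  # both traces have been observed
```

The third check fails only for the pair in which \(\pi \) is bound to \(t'\) and \(\pi '\) to t, which is why the monitor has to remember that b never occurred in t.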

With this motivation, in this paper, we introduce a novel RV algorithm for monitoring the alternation-free fragment of HyperLTL, i.e., \(\forall ^*\) and \(\exists ^*\) formulas (in Sect. 4, we argue that alternating formulas cannot be monitored using a runtime technique alone). Our algorithm takes as input a formula \(\varphi \) and a finite but unbounded-size set T of finite traces (see Fig. 1(a)). The traces in T can be produced by multiple sequential terminating or concurrent executions of a system under inspection. This means that the traces in T can grow in number and/or length at run time. The algorithm works as follows (see Fig. 1(b)):

Fig. 1. RV framework for HyperLTL

  • First, given \(\varphi \), it identifies the propositions and possibly simple Boolean expressions that need bookkeeping using a function \(\Gamma \).

  • Then, for each trace \(t_i \in T\), by incorporating the elements returned by \(\Gamma \), the monitor generates a constraint \(C_i\). This constraint encapsulates two things. It

    1. encodes what the monitor has observed in \(t_i\) with respect to the elements returned by \(\Gamma \), so it can reason about new incoming traces as well as existing traces growing in length, and

    2. rewrites the inner LTL formula in \(\varphi \) using Havelund and Rosu’s algorithm [13] and obtains a formula \(\varphi _r\).

    Hence, the resulting constraint \(C_i\) encodes the full memory of everything relevant that has occurred in \(t_i\).

  • At any point in time, the conjunction \(\bigwedge _{i=1}^m C_i\), where m is the number of traces being monitored, determines the current RV verdict (see Fig. 1(a)). That is, simplifying the conjunction shows whether \(\varphi \) has been satisfied, has been violated, or cannot yet be decided (i.e., it can go either way in the future).

Finally, we note that although the number and length of the generated constraints are theoretically unbounded, this can be prevented by making practical assumptions. One example is to incorporate a synchronization mechanism that ensures that the difference in the lengths of traces does not grow beyond a certain bound. Furthermore, the complexity of our algorithm is independent of the number of trace quantifiers in a given HyperLTL formula.

Organization. The rest of the paper is organized as follows. Section 2 presents the syntax and semantics of HyperLTL. In Sect. 3, we introduce our finite semantics for HyperLTL. Section 4 discusses challenges in monitoring HyperLTL formulas. Subsequently, the components of our RV algorithm are presented in Sects. 5 and 6. Related work is discussed in Sect. 7. Finally, we make concluding remarks and discuss future work in Sect. 8.

2 Background

Let \( AP \) be a finite set of atomic propositions and \(\mathrm {\Sigma }= 2^{ AP }\) be the finite alphabet. We call each element of \(\mathrm {\Sigma }\) a letter (or an event). Throughout the paper, \(\mathrm {\Sigma }^\omega \) denotes the set of all infinite sequences (called traces) over \(\mathrm {\Sigma }\), and \(\mathrm {\Sigma }^*\) denotes the set of all finite traces over \(\mathrm {\Sigma }\). For a trace \(t \in \mathrm {\Sigma }^\omega \) (or \(t \in \mathrm {\Sigma }^*\)), t[i] denotes the \(i^{th}\) element of t, where \(i \in \mathbb {Z}_{\ge 0}\). Also, t[0, i] denotes the prefix of t up to and including i, and \(t[i, \infty ]\) denotes the infinite suffix of t beginning with element i. By |t| we mean the length of the (finite or infinite) trace t.

Now, let u be a finite trace and v be a finite or infinite trace. We denote the concatenation of u and v by \(\sigma = uv\). Also, \(u \le \sigma \) denotes the fact that u is a prefix of \(\sigma \). Finally, if U is a set of finite traces and V is a finite or infinite set of traces, then the prefix relation \(\le \) on sets of traces is defined as:

$$U \le V \; \equiv \; \forall u \in U . \; (\exists v \in V . \; u \le v)$$

Note that V may contain traces that have no prefix in U.
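As a small illustration of the two prefix relations, the following sketch restricts V to finite traces (an infinite trace cannot be stored explicitly) and writes traces as strings, one character per letter. Both choices, and the function names, are assumptions made for the example.

```python
def is_prefix(u, v):
    """u <= v: the finite trace u is a prefix of the trace v."""
    return len(u) <= len(v) and list(v[:len(u)]) == list(u)

def set_prefix(U, V):
    """U <= V: every trace in U is a prefix of some trace in V."""
    return all(any(is_prefix(u, v) for v in V) for u in U)

U = ["ab", "abc"]
V = ["abcd", "xyz"]

print(set_prefix(U, V))  # both traces of U extend to 'abcd'
print(set_prefix(V, U))  # no trace of U extends 'abcd' or 'xyz'
```

Here `"xyz"` is a trace of V with no prefix in U, which the relation permits.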

2.1 HyperLTL

Clarkson and Schneider [9] proposed the notion of hyperproperties as a means to express security policies that cannot be expressed by traditional properties. A hyperproperty is a set of sets of execution traces. Thus, a hyperproperty essentially defines a set of systems that respect a policy. HyperLTL [8] is a logic for syntactic representation of hyperproperties. It generalizes LTL by allowing explicit quantification over multiple execution traces simultaneously.

Syntax. The set of HyperLTL formulas is inductively defined by the following grammar:

$$\begin{aligned} \begin{aligned}&\varphi {::}= \exists \pi . \varphi \mid \forall \pi . \varphi \mid \phi \\&\phi {::}= a_\pi \mid \lnot \phi \mid \phi \vee \phi \mid \phi \, \mathbf {U}\, \phi \mid \mathbf {X}\phi \end{aligned} \end{aligned}$$

where \(a \in AP \) and \(\pi \) is a trace variable from an infinite supply of variables \(\mathcal {V}\). Similar to LTL, \(\mathbf {U}\) and \(\mathbf {X}\) are the ‘until’ and ‘next’ operators, respectively. Other standard temporal connectives are defined as syntactic sugar as follows: \(\varphi _1 \, \rightarrow \, \varphi _2 = \lnot \varphi _1 \, \vee \, \varphi _2\), \(\varphi _1 \, \wedge \, \varphi _2 = \lnot (\lnot \varphi _1 \, \vee \, \lnot \varphi _2)\), \(\mathtt {true}= a_\pi \vee \lnot a_\pi \), \(\mathtt {false}= \lnot \mathtt {true}\), \(\mathbf {F}\phi = \mathtt {true} \, \mathbf {U}\, \phi \), and \(\mathbf {G}\phi = \lnot \mathbf {F}\lnot \phi \). Quantified formulas \(\exists \pi \) and \(\forall \pi \) are read as ‘along some trace \(\pi \)’ and ‘along all traces \(\pi \)’, respectively.

Semantics. That a set of traces T satisfies a HyperLTL formula \(\varphi \) is written as \(\Pi \models _T \varphi \), where the trace assignment \(\Pi : \mathcal {V}\rightarrow \mathrm {\Sigma }^\omega \) is a partial function mapping trace variables to traces. \(\Pi [\pi \rightarrow t]\) denotes the same function as \(\Pi \), except that \(\pi \) is mapped to trace t. The validity judgment for HyperLTL is defined as follows:
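For reference, the standard validity judgment of HyperLTL from [8] can be stated as below. This is a restatement under the assumption that the judgment intended here coincides with the one in [8]:

$$\begin{array}{lll} \Pi \models _T \exists \pi . \psi &{} \text {iff} &{} \exists t \in T . \; \Pi [\pi \rightarrow t] \models _T \psi \\ \Pi \models _T \forall \pi . \psi &{} \text {iff} &{} \forall t \in T . \; \Pi [\pi \rightarrow t] \models _T \psi \\ \Pi \models _T a_\pi &{} \text {iff} &{} a \in \Pi (\pi )[0] \\ \Pi \models _T \lnot \phi &{} \text {iff} &{} \Pi \not \models _T \phi \\ \Pi \models _T \phi _1 \vee \phi _2 &{} \text {iff} &{} \Pi \models _T \phi _1 \text { or } \Pi \models _T \phi _2 \\ \Pi \models _T \mathbf {X}\phi &{} \text {iff} &{} \Pi [1, \infty ] \models _T \phi \\ \Pi \models _T \phi _1 \, \mathbf {U}\, \phi _2 &{} \text {iff} &{} \exists i \ge 0 . \; \Pi [i, \infty ] \models _T \phi _2 \, \wedge \, \forall j \in [0, i) . \; \Pi [j, \infty ] \models _T \phi _1 \end{array}$$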

where the trace assignment suffix \(\Pi [i, \infty ]\) denotes the trace assignment \(\Pi '\) with \(\Pi '(\pi ) = \Pi (\pi )[i, \infty ]\) for all \(\pi \). If \(\Pi \models _T \varphi \) holds for the empty assignment \(\Pi \), then T satisfies \(\varphi \).

Example. The non-interference (NI) security policy requires any pair of traces with the same initial low observation to remain indistinguishable for low users, yet low inputs will be unaltered, irrespective of the high inputs. This policy can be specified by the following HyperLTL formula:

$$\forall \pi . \forall \pi ' . (\mathbf {G}\lambda _H (\pi ') \wedge \mathbf {G}\lnot (\bigwedge _{a \in H} a_{\pi } \leftrightarrow a_{\pi '})) \; \rightarrow \, \mathbf {G}(\bigwedge _{a \in L} a_{\pi } \leftrightarrow a_{\pi '})$$

where \(\mathbf {G}\lambda _H (\pi ')\) denotes that all the high variables in \(\pi '\) hold the value \(\lambda \), and H and L are the sets of high and low variables in their respective security levels.

3 Finite Semantics for HyperLTL

In this section, we present our finite semantics for HyperLTL, inspired by the finite semantics of LTL [15]. For a finite trace t, let \(t[i, j]\) denote the subtrace of t from position i up to and including position j:

$$t[i, j] \;\; = \;\; {\left\{ \begin{array}{ll} \epsilon &{} \text {if }~~~~ i > |t| \\ t[i, \min (j, |t|-1)] &{} \text { otherwise} \end{array}\right. }$$

where \(\epsilon \) is the empty trace. We let t[i, ..] denote \(t[i, |t|-1]\).

Let trace assignment \(\Pi _F : \mathcal {V}\rightarrow \mathrm {\Sigma }^*\) be a partial function mapping trace variables to finite traces. Similar to the infinite semantics, \(\Pi _F[\pi \rightarrow t]\) denotes the same function as \(\Pi _F\), except that \(\pi \) is mapped to finite trace t. We consider two truth values for the finite semantics: \(\top \) and \(\bot \). To distinguish finite from infinite semantics, we use \([\Pi _F \models _T \varphi ]\) to denote the valuation of HyperLTL formula \(\varphi \) for a set T of finite traces. The finite semantics for Boolean operators ‘\(\vee \)’ and ‘\(\lnot \)’ as well as for the trace quantifiers ‘\(\forall \)’ and ‘\(\exists \)’ are identical to those of infinite semantics. We define the finite semantics of HyperLTL for temporal operators as follows:

figure a

where \(\bar{X}\) denotes the ‘weak next’ operator.

Example. Consider formula \(\varphi = \forall \pi _{1}. \forall \pi _{2}. \ a_{\pi _1} \ {\mathbf {U}} \ b_{\pi _2}\) and \(T = \{t_1 = aaab, t_2 = aab, t_3 = aab\}\). Although traces \(t_1\), \(t_2\), and \(t_3\) individually satisfy the formula \(\varphi \), we have \([\Pi _F \models _T \varphi ] = \bot \), as the traces do not all agree on the position of b: it occurs at position 3 in \(t_1\) but at position 2 in \(t_2\) and \(t_3\). Now consider formula \(\varphi ' = \forall \pi _1. \forall \pi _2. \mathbf {F}a_{\pi _1} \, \wedge \, \mathbf {F}b_{\pi _2}\) and let \(T' = \{**a*b,~*b**a\}\). We have \([\Pi _F \models _{T'} \varphi '] = \top \).
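Both examples can be checked with a minimal evaluator. The sketch fixes one plausible reading of the finite semantics, which is an assumption: a proposition is false beyond the end of its trace, and the witness position of ‘until’ and ‘eventually’ must lie within the longest trace of the assignment. Formulas are nested tuples, and the symbol \(*\) is read as a letter containing neither a nor b.

```python
from itertools import product

def word(s):
    """'aab' -> [{'a'}, {'a'}, {'b'}]; '*' is a letter with neither a nor b."""
    return [set() if ch == "*" else {ch} for ch in s]

def holds(f, env, i=0):
    """Evaluate a quantifier-free formula f at position i under assignment env."""
    op = f[0]
    if op == "ap":                      # ("ap", "a", "pi1") stands for a_{pi1}
        _, name, var = f
        trace = env[var]
        return i < len(trace) and name in trace[i]
    if op == "not":
        return not holds(f[1], env, i)
    if op == "or":
        return holds(f[1], env, i) or holds(f[2], env, i)
    if op == "and":
        return holds(f[1], env, i) and holds(f[2], env, i)
    horizon = max(len(t) for t in env.values())
    if op == "F":
        return any(holds(f[1], env, k) for k in range(i, horizon))
    if op == "U":
        return any(
            holds(f[2], env, k) and all(holds(f[1], env, j) for j in range(i, k))
            for k in range(i, horizon)
        )
    raise ValueError(f"unknown operator: {op}")

def forall(variables, body, traces):
    """Universal quantification: try every way of binding variables to traces."""
    return all(
        holds(body, dict(zip(variables, choice)))
        for choice in product(traces, repeat=len(variables))
    )

until = ("U", ("ap", "a", "pi1"), ("ap", "b", "pi2"))
T = [word("aaab"), word("aab"), word("aab")]
print([forall(["pi1", "pi2"], until, [t]) for t in T])  # each trace on its own
print(forall(["pi1", "pi2"], until, T))                 # all traces together

both = ("and", ("F", ("ap", "a", "pi1")), ("F", ("ap", "b", "pi2")))
T2 = [word("**a*b"), word("*b**a")]
print(forall(["pi1", "pi2"], both, T2))
```

The enumeration over all bindings is exponential in the number of quantifiers and serves only to make the semantics concrete; it is not the monitoring algorithm of this paper.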

4 Challenges in Monitoring HyperLTL Formulas

Let us assume we are to monitor a finite but unbounded-size set T of finite traces with respect to a HyperLTL formula \(\varphi \). The traces in T can be produced by multiple sequential terminating or concurrent executions of a system under inspection. This means that traces in T can grow in number and/or length at run time. Unlike conventional runtime monitoring techniques, where the verification decision depends only upon one current execution, monitoring T for \(\varphi \) may depend on the past, future, or concurrent evolution of the traces in T. Thus, a monitor for \(\varphi \) needs to bookkeep the occurrence (and even the non-occurrence) of certain events to be able to reason about \(\varphi \) at run time. In the following, we outline a set of challenges which need to be addressed in order to develop a monitoring algorithm.

Alternating Formulas. Let \(\varphi = \forall \pi .\exists \pi '.\psi \). Verifying this formula requires us to show that for all traces in T, there exists a trace that satisfies \(\psi \). However, since the number of traces in T may grow, a runtime monitor can never prove or disprove \(\varphi \). This argument holds in general for \(\forall ^*\exists ^*\) and \(\exists ^*\forall ^*\) formulas. This is the main reason that in the remainder of this paper, we will only focus on the alternation-free fragment of HyperLTL. Observe that for \(\forall ^*\) (respectively, \(\exists ^*\)) formulas, it is possible to compute verdict \(\bot \) (respectively, \(\top \)) at run time.
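The effect of a growing T on a \(\forall \exists \) formula can be seen in a small sketch. The body `psi` is a hypothetical stand-in for \(\psi \) (a different trace with the same first letter), chosen only so that the verdict is easy to follow; traces are written as strings.

```python
def forall_exists(traces, psi):
    """Verdict of  forall pi. exists pi'. psi(pi, pi')  on the traces seen so far."""
    return all(any(psi(p, q) for q in traces) for p in traces)

def psi(p, q):
    """Hypothetical body: q is a different trace with the same first letter as p."""
    return p != q and p[0] == q[0]

observed, verdicts = [], []
for t in ["ab", "ac", "bd", "be"]:   # traces arrive one at a time
    observed.append(t)
    verdicts.append(forall_exists(observed, psi))

print(verdicts)
```

The verdict on the observed set flips each time a trace arrives, so no intermediate answer can be reported as final.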

Inter-trace Dependencies. Reasoning about \(\varphi \) by observing individual traces in T is clearly not sufficient. Progressing through the traces in T requires keeping information about the past or concurrent traces in T. One root cause is the existence of a disjunction in \(\varphi \) involving two distinct trace variables. For example, let \(\varphi = \forall \pi _{1}. \forall \pi _{2}. \ a_{\pi _1} \rightarrow \mathbf {F}b_{\pi _2}\). Now, consider two traces \(t_1 = dcf\) and \(t_2 = aeb\), where \( AP = \{a,b,c,d,e,f\}\). Note that traces \(t_1\) and \(t_2\) individually satisfy \(\varphi \), but they collectively violate \(\varphi \): a holds initially in \(t_2\), but event b does not occur in \(t_1\).

Time of Occurrence of Events. Reasoning about some formulas requires bookkeeping the time of occurrence of some propositions in each trace. For example, consider formula \(\varphi _1 = \forall \pi _{1}. \forall \pi _{2}. \ a_{\pi _1} \, {\mathbf {U}} \, b_{\pi _2}\) and traces \(t_1 = aab\), \(t_2 = ab\), and \(t_3 = aaaab\). Although each trace individually satisfies the formula, any pair of them violates the formula, as event b occurs at different times. This can become even more complex when the occurrence of some propositions needs to agree across multiple traces and multiple times. An example of such a formula is \(\varphi _2 = \forall \pi _{1}. \forall \pi _{2}. \forall \pi _{3}. \ (a_{\pi _1} \ {\mathbf {U}} \ b_{\pi _2}) \ {\mathbf {U}} \ c_{\pi _3}\), where the first occurrence of c and every occurrence of b must agree across all traces in T. For example, for traces \(t_1 = (ab)a(ac)(ac)b\), \(t_2 = (ab)a(ac)(a)(b)\), and \(t_3 = a (ac)(ac)b\), traces \(t_1\) and \(t_2\) agree on times of occurrence of b and c, but trace \(t_3\) violates this agreement, thus violating formula \(\varphi _2\). Yet other examples are formula \(\varphi _3 = \forall \pi _{1}.\forall \pi _{2}. \ \mathbf {G}(a_{\pi _1} \rightarrow a_{\pi _2})\) (which requires all traces to agree on each occurrence of a) and the non-interference formula discussed in Sect. 2.
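For \(\varphi _1\), the disagreement is visible once the position of the first b is recorded per trace. The sketch below assumes single-proposition letters written as characters of a string; `first_index` is an illustrative helper.

```python
def first_index(trace, prop):
    """Position of the first letter containing prop, or None if it never occurs."""
    for i, letter in enumerate(trace):
        if prop in letter:
            return i
    return None

traces = {"t1": "aab", "t2": "ab", "t3": "aaaab"}
first_b = {name: first_index(w, "b") for name, w in traces.items()}
agree = len(set(first_b.values())) == 1

print(first_b)
print(agree)
```

Every trace reports a different position for b, so no two of them can jointly satisfy \(\varphi _1\).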

5 Identifying Propositions of Interest

The challenges and examples outlined in Sect. 4 suggest that monitoring a HyperLTL formula requires identifying the propositions that shape the trace agreement to be followed amongst distinct traces. We call this process bookkeeping, and we let \(\mathcal {BK}\) denote the set of all elements that require bookkeeping and \(\Gamma \) the function that computes \(\mathcal {BK}\).

We note that only the structure of the HyperLTL formula contributes to the elements of \(\mathcal {BK}\). More precisely, the ‘until’ operator is the main contributor to \(\mathcal {BK}\), as its semantics (in particular, the existential quantifier) may delineate the existence of an index for satisfaction of some propositions across multiple traces. Moreover, we may need to bookkeep Boolean expressions (and not just atomic propositions). We may prefix elements of \(\mathcal {BK}\) by either \(\#\) or \(\mathbf {X}\). Prefixing an element by \(\#\) means that only the first occurrence of the element needs to be bookkept. Prefixing by \(\mathbf {X}\) means that bookkeeping starts from the next state.

Examples. In formula \(\forall \pi _1. \forall \pi _2. \forall \pi _3. (a_{\pi _1} \, {\mathbf {U}} b_{\pi _2}) \, {\mathbf {U}} c_{\pi _3}\), we will have \(\mathcal {BK}= \{b, \#c\}\), meaning every occurrence of b and only the first occurrence of c should be memorized. For formula \(\forall \pi _1.\forall \pi _2. a_{\pi _1} \, {\mathbf {U}} \, (b_{\pi _2} \, \vee \, c_{\pi _2})\), we have \(\mathcal {BK}= \{\#(b \, \vee \,c)\}\). However, for formula \(\forall \pi _1.\forall \pi _2. \forall \pi _3. a_{\pi _1} \, {\mathbf {U}} \, (b_{\pi _2} \, \vee \, c_{\pi _3})\), we have \(\mathcal {BK}= \{\#b, \#c\}\). Finally, for formula \(\forall \pi . \forall \pi '. \mathbf {X}(a_{\pi } \, {\mathbf {U}} b_{\pi '})\), we will have \(\mathcal {BK}= \{\mathbf {X}\#b\}\).

Our bookkeeping recursive function \(\Gamma \) takes as input a HyperLTL formula, a set of trace variables \(\mathcal {V}\) (initially empty), and a Boolean value (initially \( false \)), and it returns as output the set \(\mathcal {BK}\), defined in Fig. 2. The function works as follows. The first three cases are straightforward, as a HyperLTL formula involving only a proposition requires bookkeeping if it is under the scope of an ‘until’ operator, whereas operators \(\lnot \) and \(\mathbf {X}\) allow the recursive application of \(\Gamma \) function to the formula \(\phi \). The symbol \(\odot \) denotes the application of unary operators (\(\lnot \), # and \(\mathbf {X}\)) to the elements of set \(\mathcal {BK}\) (e.g., \(\lnot \odot \{a, b\} = \{\lnot a, \lnot b\}\)).

For the next case, \(\phi _1 \mathbf {U}\phi _2\), we require further matching on the structure of both \(\phi _1\) and \(\phi _2\), as follows:

Fig. 2. Bookkeeping function \(\Gamma \)

  • (Case 1: Both operands are propositions). In this case, \(\Gamma \) returns \(\{\#b\}\) if \(\pi \) and \(\pi '\) are bound by different quantifiers or removing \(\pi '\) from \(\mathcal {V}\) does not result in an empty set. Otherwise, \(\Gamma \) returns the empty set. For example, consider two formulas \(\forall \pi _1. a_{\pi _1} \, \mathbf {U}\, b_{\pi _1}\) and \(\forall \pi _1. \forall \pi _2. a_{\pi _1} \, \mathbf {U}\, b_{\pi _2}\). The first formula does not require any trace agreement whereas the second does require a trace agreement due to the scope of the trace quantifiers.

  • (Case 2: Only the left operand is a proposition). In this case, we store the trace variable associated with a in set \(\mathcal {V}\) and invoke \(\Gamma \) recursively on formula \(\phi _2\). We also set the value of Boolean variable \(k\) to \( true \), which indicates that the original formula \(\phi \) includes an ‘until’ operator. For example, for formula \(\forall \pi . a_{\pi } \, \mathbf {U}\, (b_{\pi } \mathbf {U}c_{\pi })\), recursing through \(\Gamma \) will result in an empty set since the trace variables do not vary, whereas for formula \(\forall \pi _1. \forall \pi _2. a_{\pi _1} \, \mathbf {U}\, (b_{\pi _1} \, \mathbf {U}\, c_{\pi _2})\), the \(\Gamma \) function will simply return \(\{\#c\}\).

  • (Case 3: Neither operand is a proposition). In this case, we recurse through \(\phi _1\) only if it contains an ‘until’ operator, where \({trace\_vars}(\phi )\) denotes the set of trace variables found in \(\phi \). Furthermore, we recurse through \(\phi _2\) and indicate that any elements produced need to be tracked only once (i.e., their first occurrence). Moreover, we prefix the recursion of \(\Gamma \) on \(\phi _1\) by symbol \(\#^{-1}\), which removes the prefix \(\#\) from elements which require tracking more than once. The result is the union of both produced sets. For example, for formula \(\forall \pi _1. \forall \pi _2. \forall \pi _3. \forall \pi _4. (a_{\pi _1} \mathbf {U}b_{\pi _2}) \mathbf {U}(c_{\pi _3} \mathbf {U}d_{\pi _4})\), we have \(\mathcal {BK}= \{b, \#d\}\). Note that expressions \(\#^{-1}\# a\) and \(\# \# b\) are equivalent to a and \(\#b\), respectively.

The last inductive case includes an ‘or’ (\(\vee \)), which also requires further matching on the structure of formulas \(\phi _1\) and \(\phi _2\). Here, we consider the condition of \(k\), which reflects the case when \(\phi _1 \vee \phi _2\) is under the scope of an ‘until’ operator. For example, consider formula \(\forall \pi _1. \forall \pi _2. a_{\pi _1} \, \mathbf {U}\, (b_{\pi _2} \vee c_{\pi _2})\). The application of the \(\Gamma \) function results in \(\Gamma (b_{\pi _2} \vee c_{\pi _2}, \mathcal {V}, k:= true )\), which further results in \(\{\#(b \vee c)\}\). In contrast, for formula \(\forall \pi _1. \forall \pi _2. \forall \pi _3. a_{\pi _1} \, \mathbf {U}\, (b_{\pi _2} \vee c_{\pi _3})\), the \(\Gamma \) function returns \(\{\#b, \#c\}\) due to the disparity of trace variables.
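The case analysis can be approximated in code. The following is a simplified reconstruction from the prose of this section, not the definition in Fig. 2: it takes the quantifier-free body as nested tuples, merges the cases of \(\mathbf {U}\) into one rule, and writes elements of \(\mathcal {BK}\) as strings such as `"#c"` or `"X#b"`. All names are hypothetical, and agreement with Fig. 2 beyond the examples quoted in this section is not claimed.

```python
def trace_vars(f):
    """Set of trace variables occurring in f; ("ap", "a", "pi1") is a_{pi1}."""
    if f[0] == "ap":
        return {f[2]}
    return set().union(*(trace_vars(g) for g in f[1:]))

def has_until(f):
    return f[0] != "ap" and (f[0] == "U" or any(has_until(g) for g in f[1:]))

def is_boolean(f):
    """True for expressions built from propositions, 'not' and 'or' only."""
    return f[0] == "ap" or (f[0] in ("not", "or") and all(is_boolean(g) for g in f[1:]))

def show(f):
    if f[0] == "ap":
        return f[1]
    if f[0] == "not":
        return "!" + show(f[1])
    return "(" + show(f[1]) + "|" + show(f[2]) + ")"

def gamma(f, seen=frozenset(), under_until=False):
    """Simplified bookkeeping function: returns the set BK for body f."""
    op = f[0]
    if is_boolean(f) and len(trace_vars(f)) == 1:
        # Bookkeep only under an 'until' whose left side uses another trace variable.
        needed = under_until and bool(seen - trace_vars(f))
        return {"#" + show(f)} if needed else set()
    if op in ("not", "or"):
        return set().union(*(gamma(g, seen, under_until) for g in f[1:]))
    if op == "X":
        return {"X" + e for e in gamma(f[1], seen, under_until)}
    if op == "U":
        left, right = f[1], f[2]
        out = gamma(right, seen | trace_vars(left), True)
        if has_until(left):
            # The '#^-1' step: elements of the left operand are tracked at every occurrence.
            out |= {e[1:] if e.startswith("#") else e for e in gamma(left, seen, False)}
        return out
    raise ValueError(f"unknown operator: {op}")

a1, b1, b2 = ("ap", "a", "pi1"), ("ap", "b", "pi1"), ("ap", "b", "pi2")
c2, c3, d4 = ("ap", "c", "pi2"), ("ap", "c", "pi3"), ("ap", "d", "pi4")

print(gamma(("U", ("U", a1, b2), c3)))
print(gamma(("U", a1, ("or", b2, c2))))
print(gamma(("U", a1, ("or", b2, c3))))
print(gamma(("X", ("U", a1, b2))))
```

On the formulas discussed in this section, the sketch returns the same sets as stated in the text, for instance \(\{b, \#c\}\) for the first call.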

Theorem 1

(Soundness and optimality of \(\Gamma \) function). Given a HyperLTL formula \(\varphi \) and assuming we have set T such that \([\Pi _F \models _T \varphi ] = \top \) then

  • \(\Gamma \) function returns all the propositions required for bookkeeping.

  • Given the set \(\mathcal {BK}\), every element \(k \in \mathcal {BK}\) is included in some trace agreement described by \(\varphi \).

6 Monitoring Algorithm

6.1 Algorithm Sketch

Given an alternation-free HyperLTL formula \(\varphi \) of the form \(\forall ^*\), our algorithm consists of the following elements:

  1. Monitor: In order to monitor \(\varphi \), we begin by reading an event for a particular trace and start generating the constraints. At any point in time, we can take a snapshot of our system and utilize our satisfaction function \(\mathtt {SAT}\) to find the RV verdict (see Fig. 1(a)).

  2. Constraint Handler: Next, we manipulate \(\varphi \) according to its structure. Disjunctions are divided and treated separately to detect which half prompted the satisfaction. Each sub-formula of the disjunction is then subject to \(\mathtt {ConstraintRewriting}\). Temporal formulas without disjunction do not undergo any manipulation before being sent to \(\mathtt {ConstraintRewriting}\).

  3. Constraint Rewriting: Initially, \(\varphi \) is stripped of its quantifiers. This allows for rewriting using the technique in [22] to evaluate the altered formula \(\varphi _r\). The events are examined against the propositions or Boolean expressions in \(\mathcal {BK}\) and the satisfaction of \(\varphi _r\) to generate the corresponding constraints.

  4. Satisfaction Function \(\mathtt {SAT}\): On each invocation of the \(\mathtt {SAT}\) function, we compute the conjunction of all the constraints collectively. If \(\mathtt {SAT}\) returns \(\mathtt {false}\), then \(\varphi \) is violated. Otherwise, the constraints are further checked for possible refinement by checking the membership of other generated constraints.

Observe that a formula of the form \(\forall ^*\) cannot be evaluated to \(\top \). This would require the full set of all possible system traces, which is not possible at run time. We note that monitoring a formula of the form \(\exists ^*\) can be achieved by simply monitoring its negation which would be of the form \(\forall ^*\).

6.2 Algorithm Details

We utilize the following HyperLTL formula as a running example to demonstrate the steps of our proposed algorithm.

$$ \forall \pi _1. \forall \pi _2. \forall \pi _3. \forall \pi _4. \; ((a_{\pi _1} \vee b_{\pi _2}) \, \mathbf {U}\, c_{\pi _3}) \vee d_{\pi _4} $$

where \( AP = \{a,b,c,d\}\). We now describe the algorithm in detail which leads to the overview of Fig. 1.

Algorithm 1 (HyperLTL Monitor). This is our main monitoring algorithm, which consists of a while loop. We continue to iterate as long as new events associated with a trace come in and until we find a violation. On Lines 2–3, we check for a new trace and then add it to our set of traces M. Given that the incoming event is associated with some trace \(t_j\), at Line 4, we call \(\mathtt {ConstraintsHandler}\) for \(t_j\), which returns constraint \(C_j\). Lines 5–6 deal with the process of taking a snapshot of our system to determine the RV verdict using function \(\mathtt {SAT}\). Finally, if the returned value from function \(\mathtt {SAT}\) is \(\mathtt {false}\) (Lines 7–9), then we have found a violation and return \(\bot \) (Line 10). Otherwise, we continue to iterate through the while loop.

Algorithm 2 (Constraint Handler). In this algorithm, we treat the given HyperLTL formula according to its structure. The algorithm is recursively applied to the given formula based on different cases. The first block of the algorithm (Lines 1–10) handles the case (\(\varphi = \phi _1 \vee \phi _2\)), where the given (sub-)formula is a disjunction. In particular, we call the \(\mathtt {ConstraintsHandler}\) function for both \(\phi _1\) and \(\phi _2\) (Lines 2–3). We also need to pass the information about the elements of \(\mathcal {BK}\) which are associated with \(\phi _1\) and \(\phi _2\) (as given by \(\mathcal {BK}_{\phi _i}\)). In our running example, we have \(\phi _1 = ((a_{\pi _1} \vee b_{\pi _2}) \, \mathbf {U}\, c_{\pi _3})\) and \(\phi _2 = d_{\pi _4}\). In case both values from previous steps are \(\mathtt {false}\), then we have found a violation and the algorithm returns \(\mathtt {false}\) (Lines 4–5). On the other hand, if one of the values from Lines 2 and 3 is a constraint, then we return the corresponding constraint (Lines 6–7). Moreover, if both values have generated constraints, we return them both (Line 10), meaning that any one of them can influence the verdict in the future.

The next block in the algorithm (Lines 12–22) handles the case when the input formula contains an ‘until’ operator with a disjunction on the left operand with a disparity in corresponding trace quantifiers. We invoke the \(\mathtt {ConstraintsHandler}\) function for both operands of ‘\(\vee \)’; i.e., \(\phi _L\) and \(\phi _R\) (Lines 13–14). In our running example, \(\phi _1 = ((a_{\pi _1} \vee b_{\pi _2}) \, \mathbf {U}\, c_{\pi _3})\) matches this case and \(a_{\pi _1}\) and \(b_{\pi _2}\) will go through \(\mathtt {ConstraintsHandler}\). If both values in Lines 13 and 14 result in \(\mathtt {false}\), then the formula has been violated and we return \(\mathtt {false}\).

However, if only one of the sides returns some constraints, then we return \(\mathtt {false}\) and the alternate constraint for further refinement (Lines 17–20). Finally, if both sides satisfy the formula, then we return a combination of the returned values of Lines 13 and 14. This allows us to refine the constraints from the function \(\mathtt {SAT}\) in Algorithm 4.

figure b

The last part of the algorithm (Lines 24–28) invokes the \(\mathtt {ConstraintRewriting}\) function, which returns the constraints for other types of formulas. For example, formula \(\forall \pi _1. \forall \pi _2. \forall \pi _3. \forall \pi _4. (a_{\pi _1} \mathbf {U}b_{\pi _2}) \, \mathbf {U}\, (c_{\pi _3} \, \mathbf {U}\, d_{\pi _4})\) will directly undergo constraint generation.

Algorithm 3 (Constraints Rewriting). This algorithm generates the constraints (denoted by r) by utilizing the elements of \(\mathcal {BK}\). We set the initial value of r to \(\mathtt {true}\) as we have no violation at the start of the monitoring process. We strip off the quantifiers of our formula \(\varphi \) to convert it into its corresponding LTL form \(\varphi _r\) (Line 2). For example, \(\forall \pi _1. \forall \pi _2. (a_{\pi _1} \,\mathbf {U}\, b_{\pi _2})\) will be converted to \((a \, \mathbf {U}\, b)\). Then, we apply the \(\mathtt {REWRITE}\) function to formula \(\varphi _r\) with the given event \(e_i\) (Line 3). This function is essentially the rewriting algorithm by Havelund and Rosu [13] (see Algorithm 5). If the event violates our formula, then we immediately return the violation (Lines 4–5).
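The rewriting step can be pictured as event-consuming formula rewriting for LTL. The sketch below is a minimal version in the spirit of [13], not a transcription of Algorithm 5: formulas are nested tuples, the Boolean constants are Python's `True` and `False`, and end-of-trace handling is left out. The function names are illustrative.

```python
def simplify_not(x):
    if x is True:
        return False
    if x is False:
        return True
    return ("not", x)

def simplify_or(l, r):
    if l is True or r is True:
        return True
    if l is False:
        return r
    if r is False:
        return l
    return ("or", l, r)

def simplify_and(l, r):
    if l is False or r is False:
        return False
    if l is True:
        return r
    if r is True:
        return l
    return ("and", l, r)

def rewrite(f, event):
    """Consume one event; return the formula the rest of the trace must satisfy."""
    if f is True or f is False:
        return f
    op = f[0]
    if op == "ap":                      # ("ap", "a") is the proposition a
        return f[1] in event
    if op == "not":
        return simplify_not(rewrite(f[1], event))
    if op == "or":
        return simplify_or(rewrite(f[1], event), rewrite(f[2], event))
    if op == "and":
        return simplify_and(rewrite(f[1], event), rewrite(f[2], event))
    if op == "X":
        return f[1]
    if op == "U":
        # phi U psi  is equivalent to  psi or (phi and X(phi U psi))
        return simplify_or(rewrite(f[2], event),
                           simplify_and(rewrite(f[1], event), f))
    raise ValueError(f"unknown operator: {op}")

a_until_b = ("U", ("ap", "a"), ("ap", "b"))
state = a_until_b
for event in [{"a"}, {"a"}, {"b"}]:
    state = rewrite(state, event)
print(state)
```

A result of `False` after some event corresponds to the immediate violation returned in Lines 4–5; any other result is the obligation carried to the next event.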

If \(\phi \) is not violated and if the event satisfies any object \(a \in \mathcal {BK}\), then a is considered for our constraints (Line 6). Given that the position of the event in a trace is i, in Line 7 we apply \(\mathbf {X}^i\) to a (i.e., \(\mathbf {X}^i a\)). The elements of \(\mathcal {BK}\) which are prefixed by “\(\#\)” are removed from \(\mathcal {BK}\), as only their first appearance is significant (Lines 8–9). In our running example, the invocation of \(\mathtt {ConstraintRewriting}\) for \(a_{\pi _1} \, \mathbf {U}\, c_{\pi _3}\) with set \(\mathcal {BK}= \{\#c\}\) and consecutive events of traces \(t_1 = (ab) (ab) a (ad) c\), \(t_2 = a (abcd)\), \(t_3 = c\) will result in \(r_1 = \mathbf {X}^{4}c\), \(r_2 = \mathbf {X}c\) and \(r_3 = c\), respectively.

The elements of \(\mathcal {BK}\) with “\(\mathbf {X}\)” operators are considered for upcoming events by stripping one instance of “\(\mathbf {X}\)” on that element (Lines 10–11). Indeed, the presence of \(\mathbf {X}\)’s in the elements of \(\mathcal {BK}\) delays the observation and exposes the corresponding proposition to be observed for constraint generation in the subsequent rounds. Finally, we return our generated constraint r.
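The bookkeeping part of this algorithm (Lines 6–9) can be sketched as follows. The sketch covers only plain and \(\#\)-prefixed propositions; it omits the \(\mathtt {REWRITE}\) step and \(\mathbf {X}\)-prefixed elements, and it represents the constraint \(\mathbf {X}^i a\) as the pair `(i, "a")`. These representation choices and the function name are assumptions.

```python
def constraints_for_trace(trace, bk):
    """Record (i, p), standing for X^i p, for each bookkept proposition p seen at position i."""
    pending = set(bk)
    recorded = []
    for i, event in enumerate(trace):
        for element in sorted(pending):
            first_only = element.startswith("#")
            prop = element.lstrip("#")
            if prop in event:
                recorded.append((i, prop))
                if first_only:
                    # Only the first occurrence matters, so stop tracking it.
                    pending.discard(element)
    return recorded

t1 = [{"a", "b"}, {"a", "b"}, {"a"}, {"a", "d"}, {"c"}]
t2 = [{"a"}, {"a", "b", "c", "d"}]
t3 = [{"c"}]

for trace in (t1, t2, t3):
    print(constraints_for_trace(trace, {"#c"}))
```

For the three traces of the running example, the recorded pairs have positions 4, 1 and 0, matching \(r_1\), \(r_2\) and \(r_3\) above.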

figure c

Algorithm 4 (Satisfaction Function). The input of the \(\mathtt {SAT}\) function is a set consisting of the constraints associated with each trace, i.e., \(\mathcal {C} = \{C_1, C_2, \dots , C_m\}\). We can imagine all these constraints as rows of a matrix. For our running example, we will have \(\mathcal {C}_i = [C^{(a_{\pi _1} \,\mathbf {U}\, c_{\pi _3})}_{i},C^{(b_{\pi _2} \,\mathbf {U}\, c_{\pi _3})}_{i}, C^{d_{\pi _4}}_{i} ]\) where i corresponds to the \(i^{th}\) trace in M. We iterate through the columns for each of the traces and conjoin their constraints. If they evaluate to \(\mathtt {false}\), then we can drop the column as traces have found a disagreement (Lines 3–8). If the conjunction is not \(\mathtt {false}\), we acquire the longest constraint \(m'\) of the corresponding column. We then check that no constraints associated with other traces disagree by confirming that they are members of \(m'\) (Lines 10–11). If one of the constraints disagrees, then we drop the column, or else we have found an agreement of constraints between the traces (Lines 12–14). Finally, we return a violation if we were unable to find any agreement within the constraints between traces (Lines 15–18).

Note that the process of dropping columns indeed results in a refined set of constraints. Since the incoming traces can progress at various speeds, we confirm that the constraints for “slower” traces are in fact members of the “fastest” trace’s constraints. If no traces contradict the “fastest” trace, then this suggests that no disagreement has yet emerged in the system. We resume taking snapshots of the system until a violation is detected.

Theorem 2

(Correctness of Algorithm 1 ). Let \(\varphi \) be a HyperLTL formula. Algorithm 1 returns \(\bot \) for an input set of traces T iff \([\Pi _F \models _T \varphi ] = \bot \).

6.3 Discussion

Our algorithms show that an appropriate choice of the propositions or Boolean expressions to bookkeep, paired with an effective structural division of a HyperLTL formula, provides an effective way to monitor complex HyperLTL formulas. Additionally, we encode only the minimum information needed to check that the agreement between traces is delineated according to the observed locations of propositions or Boolean expressions.

A potential drawback of our RV technique is that its memory requirement is theoretically unbounded. However, this requirement does not affect the cases where the verification is done offline. For online RV, we can still use our algorithms by making practical assumptions. For example, we can incorporate a synchronization mechanism amongst traces to ensure that the difference in length of traces is not beyond some bound. We note that the worst-case complexity of Algorithm 1 is \(\mathcal {O}(|t|\cdot |T|)\), where |t| is the length of the longest trace in set T. Interestingly, this complexity is independent of the number of trace quantifiers in a given HyperLTL formula. Indeed, the set \(\mathcal {BK}\), computed before run time by the \(\Gamma \) function, provides the means to avoid dependence on the trace quantifiers, which would otherwise be polynomial in the order of the number of quantifiers. We believe that our proposed algorithm is efficient enough to be adopted for the monitoring of security policies in real-world applications.
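One way to picture the synchronization assumption mentioned above is a simple admission check that lets a trace consume its next event only if it stays within a fixed distance of the shortest trace. The following Python sketch is purely illustrative: the function `may_advance`, the `lengths` map, and the `bound` parameter are hypothetical names and are not part of the algorithms in this paper.

```python
from typing import Mapping


def may_advance(lengths: Mapping[str, int], trace_id: str, bound: int) -> bool:
    """Decide whether a trace may consume one more event.

    `lengths` maps each trace identifier to the number of events observed so
    far. The trace may advance only if, after the step, it is at most `bound`
    events ahead of the shortest trace. (Hypothetical policy, for illustration.)
    """
    shortest = min(lengths.values())
    return lengths[trace_id] + 1 - shortest <= bound


# Trace t1 is already two events ahead of t2, so with bound 2 it must wait.
observed = {"t1": 5, "t2": 3}
```

With such a check in place, the number of positions that must be remembered for any "slower" trace is limited by `bound`, which is the practical effect the synchronization assumption is meant to achieve.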

Note that our proposed algorithm can only be used to monitor the alternation-free fragment (i.e., \(\forall ^*\) and \(\exists ^*\)) of HyperLTL, which can express a wide class of security policies including non-interference and declassification. However, the specification of some security policies requires alternation in the trace quantifiers. For example, noninference [17] specifies that the behavior of low variables should not change when all high variables are replaced by an arbitrary variable \(\lambda \), given as follows:

$$\forall \pi . \exists \pi ' . (\mathbf {G}\lambda _H (\pi ') \wedge \mathbf {G}(\bigwedge _{a \in L} a_{\pi } \leftrightarrow a_{\pi '}))$$

Similarly, generalized non-interference (GNI) [16] also requires alternation in trace quantifiers as it allows non-determinism in the low variables of the system.

7 Related Work

Static Analysis. Sabelfeld and Myers [24] survey the literature focusing on static program analysis for enforcement of security policies. In some cases, with compilers using just-in-time (JIT) compilation techniques and dynamic inclusion of code at run time in web browsers, static analysis does not guarantee secure execution at run time. Type systems and frameworks for JavaScript [6] and ML [21] are some approaches to monitor information flow. Several tools [11, 18, 19] add extensions such as statically checked information flow annotations to the Java language. Clark and Hunt [7] present verification of information flow for deterministic interactive programs. On the other hand, our approach is capable of monitoring the subset of hyperproperties described by alternation-free HyperLTL, and not just information flow, without assistance from static analyzers. In [2], the authors propose a technique for designing runtime monitors based on abstract interpretation of the system under inspection.

Dynamic Analysis. Russo and Sabelfeld [23] concentrate on permissive techniques for the enforcement of information flow under flow-sensitivity. It has been shown that in the flow-insensitive case, a sound purely dynamic monitor is more permissive than static analysis. However, they show the impossibility of such a monitor in the flow-sensitive case. A framework for inlining dynamic information flow monitors has been presented by Magazinius et al. [14]. The approach by Chudnov and Naumann [5] uses hybrid analysis instead and argues that due to JIT compilation processes, it is no longer possible to mediate every data and control flow event of the native code. They leverage the results of Russo and Sabelfeld [23] by inlining the security monitors. Chudnov et al. [4] again use hybrid analysis of 2-safety hyperproperties in relational logic. In [1], the authors propose an automata-based RV technique for monitoring only a disjunctive fragment of alternation-free HyperLTL.

Austin and Flanagan [3] implement a purely dynamic monitor; however, restrictions such as “no-sensitive upgrade” are imposed. Some techniques deploy taint tracking and labelling of data variables dynamically [20, 26]. Zdancewic and Myers [25] verify information flow for concurrent programs. Most of the techniques cited above aim to monitor security policies described solely with two trace quantifiers (without alternation) by observing a single run, whereas our work applies to any hyperproperty that can be described with alternation-free HyperLTL, when multiple runs are observed.

SME. Secure multi-execution [10] is a technique to enforce non-interference. In SME, one executes a program multiple times, once for each security level, using special rules for I/O operations. Outputs are only produced in the execution linked to their security level. Inputs are replaced by default inputs except in executions linked to their security level or higher. Input side effects are supported by making higher-security-level executions reuse inputs obtained in lower-security-level threads. This approach is sound in a deterministic language.
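The SME scheme summarized above can be sketched for a two-level lattice as follows. This is a simplified illustration and not the construction of [10]: it models a program as a pure function from per-channel inputs to per-channel outputs, ignores input side effects, and uses hypothetical names (`LEVELS`, `DEFAULT_INPUT`, `secure_multi_execution`, `leaky`).

```python
from typing import Callable, Dict

# Hypothetical two-point lattice, ordered from least to most confidential.
LEVELS = ("low", "high")
DEFAULT_INPUT = 0  # assumed placeholder for inputs an execution may not see

Program = Callable[[Dict[str, int]], Dict[str, int]]


def secure_multi_execution(program: Program, inputs: Dict[str, int]) -> Dict[str, int]:
    """Run `program` once per security level and combine the outputs."""
    outputs: Dict[str, int] = {}
    for rank, level in enumerate(LEVELS):
        # An execution sees real inputs only at its own level or below;
        # inputs from higher levels are replaced by the default value.
        visible = {
            channel: (value if LEVELS.index(channel) <= rank else DEFAULT_INPUT)
            for channel, value in inputs.items()
        }
        result = program(visible)
        # Only the output on the channel matching this execution is kept.
        outputs[level] = result[level]
    return outputs


def leaky(inputs: Dict[str, int]) -> Dict[str, int]:
    """Example program that leaks the high input into the low output."""
    return {"low": inputs["low"] + inputs["high"], "high": inputs["high"]}
```

In this sketch, the low output of `leaky` under `secure_multi_execution` no longer depends on the high input, because the low execution only ever sees `DEFAULT_INPUT` on the high channel.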

While there are small similarities between SME and our work, there are fundamental differences. SME only focuses on non-interference and aims to enforce it, but there are many critical hyperproperties that differ from non-interference that our method is able to monitor. Thus, SME enforces a security policy at the cost of restricting what it can enforce, whereas our technique monitors a much larger set of policies.

8 Conclusion

In this paper, we introduced an algorithm for monitoring the alternation-free fragment of HyperLTL [8], a temporal logic that allows for expressing complex information-flow properties like generalized non-interference, declassification, and quantitative non-interference. The main challenge in designing an RV algorithm for HyperLTL formulas is that reasoning about the formula involves analyzing multiple traces (as opposed to a single trace in traditional RV techniques). Our algorithm has three components: (1) a function that identifies propositions that have to be bookkept across multiple traces, (2) a constraint generator that encodes the occurrence of propositions of interest, and (3) a rewriting module based on the algorithm in [22] that incorporates formula progression with respect to incoming events for traces. In our view, our algorithm is a significant step forward in monitoring sophisticated information-flow security and privacy policies.

Our first step to extend this work will be to implement our algorithm and test it for real-world applications, e.g., in smartphones. For future work, one may consider RV algorithms based on monitor synthesis (as opposed to rewriting). We are also planning to develop techniques for monitoring alternating HyperLTL formulas. We believe dealing with such formulas is not possible without assistance from a static analyzer.