1 Introduction

Using structured domain knowledge, production rule systems realize a diversity of tasks in domains such as business, science and healthcare. Knowledge is formulated as a set of productions (i.e., if-then rules) together with a set of assertions. In the healthcare domain, production rule systems are often at the core of Clinical Decision Support Systems (CDSS), which aid in diagnosis, prognosis and treatment tasks [1, 2]. To semantically structure health data, a variety of biomedical ontologies (e.g., see BioPortal [3]) and clinical health terminologies (e.g., SNOMED-CT [4]) are available. In order to improve decision support accuracy, a CDSS can leverage the embedded semantics of semantic health data, such as subclass, transitive and symmetric relations between drugs, illnesses and treatments. Performing this kind of ontology-based, semantic reasoning in (production) rule systems requires a rule-based axiomatization of ontology semantics. The W3C OWL2 RL profile [5] is highly relevant, since it partially axiomatizes the OWL2 RDF-based semantics as a set of high-level, abstract IF-THEN rules.

There is a growing demand to deploy production rule systems, such as clinical decision support systems, directly on mobile, resource-constrained platforms. Examples include clinical, time-sensitive tasks to be performed directly on mobile consumer devices [6], and sensor networks pushing reasoning down to the device layer to cope with unstable communication [7]. However, recent benchmarks [8] show that the mobile performance of existing, desktop- or server-based reasoners still leaves much to be desired. It may be noted that, although modern mobile consumer devices are outfitted with 2 GB of RAM or more, single apps are only assigned max. 192 MB on Android; whereas devices in sensor networks may even feature much less memory.

To optimize semantic, ontology-based reasoning in rule systems, we propose a novel version of the RETE algorithm, a well-known algorithm for production rule systems, which aims to better balance memory usage with performance. The RETE algorithm uses alpha nodes to represent rule premises, with alpha memories keeping matching premise facts (tokens). Our proposed RETE pool algorithm is based on the observation that generic rule premises, which occur frequently in OWL2 RL, result in a large duplication of data in alpha memories. For instance, the same set of tokens can match OWL2 RL premises <?x ?p ?y>, <?c owl:unionOf ?x> and <?c rdf:type owl:Class>. An extreme example is a “wildcard” premise, i.e., with variables at all positions, which will effectively duplicate the data from all other premises. By pooling a selection of alpha memories into a single shared memory, the RETE pool algorithm aims to reduce duplication of data in RETE. We note that RETE pool is well-suited towards Semantic Web settings that typically involve an existing, multi-purpose RDF store. We integrated this algorithm into Apache Jena [9] and AndroJena [10], a port of Apache Jena for mobile (Android) platforms. We present an extensive evaluation of semantic reasoning, using a rule system featuring the RETE pool algorithm and an OWL2 RL ruleset, both on PC and mobile platform (Android). This work is motivated by our previous work, where we (1) developed a mobile patient diary with built-in local decision support [6] based on rule-based reasoning; and (2) presented a set of mobile benchmarks, together with a mobile benchmark framework, using existing reasoners [11].

The paper is structured as follows. First, Sect. 2 summarizes our OWL2 RL ruleset, which implements the OWL2 RL specification and is used to realize semantic reasoning. In Sect. 3, we summarize and exemplify the RETE algorithm. Section 4 presents the RETE pool algorithm. In Sect. 5, we extensively evaluate semantic reasoning using RETE pool and our OWL2 RL ruleset. In Sect. 6, we discuss relevant state of the art, and Sect. 7 presents conclusions and future work.

2 OWL2 RL Ruleset

To realize semantic reasoning on mobile, resource-constrained platforms, we rely on the W3C OWL2 RL profile. The OWL2 Web Ontology Language Profiles document [5] introduces three distinct OWL2 profiles, which are optimized to handle specific application scenarios. The OWL2 RL profile is aimed at balancing expressivity with reasoning scalability, and presents a partial, rule-based axiomatization of OWL2 RDF-Based Semantics [12]. As only a partial axiomatization, OWL2 RL guarantees completeness for ABox reasoning but not for TBox reasoning [13]; and places syntactic restrictions on ontologies to ensure all correct inferences. Nevertheless, this trade-off seems acceptable when targeting scalable reasoning on resource-constrained platforms.

Based on the OWL2 RL specification, we created a concrete OWL2 RL ruleset that is re-usable by any arbitrary rule engine, which means no particular internal support (e.g., for datatypes or lists) was assumed. Below, we focus on 3 non-trivial issues that occur when attempting to create an OWL2 RL ruleset:

(1) A pair of rules (#dt-type2 and #dt-not-type) support RDF datatype semantics by inferring datatypes (e.g., typing integer “42” with xsd:int) and flagging datatype inconsistencies. Two other rules (#dt-eq and #dt-diff) infer equality and inequality of literals, which requires differentiating literals from URIs (to avoid these rules firing for URI resources as well). As such, these rules require built-in support for RDF datatypes and literals, which cannot be assumed for arbitrary systems; hence, we chose to leave out these rules. We note that related work, including DLEJena [14] and the SPIN [15] and OWLIM [16] OWL2 RL rulesets, also does not include datatype rules. Others opted to leave out datatype rules due to their significant performance issues [17].

(2) Another set of rules lack an antecedent and are thus always applicable. Some of these rules lack variables (e.g., specifying that owl:Thing has type owl:Class), and were represented as axiomatic triples accompanying the ruleset. Other rules comprise “quantified” variables in the consequent; e.g., stating that each annotation property has type owl:AnnotationProperty. Similarly, these were implemented by creating an axiom for each annotation property (OWL2 [18]) and datatype property (OWL2 RL [5]).

(3) N-ary rules refer to a finite list of elements. A first subset (L1) places restrictions on a limited number of list elements; e.g., #eq-diff2 flags an inconsistency if two members of an owl:AllDifferent instance are defined as equivalent. A second subset (L2) places restrictions on all elements; e.g., #cls-int1 will type a resource with an intersection class only when the resource is typed by all of the intersection member classes. A third subset (L3) yields inferences for all list elements; e.g., #scm-uni will infer subclass relations for all classes that are members of a union class.

Rulesets (L1) and (L3) can be supported by adding two auxiliary list-membership rules (Rule 1), which link each list element to all preceding list cells; meaning the first cell will be directly linked to all elements.
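The effect of such list-membership rules can be illustrated with a minimal Python sketch. It is not the ruleset from [20]: the helper predicate `ex:hasMember`, the function names and the tuple-based triple representation are assumptions made for illustration only. One rule links a list cell to its own element; the other lets a cell inherit the members of the next cell, so that the first cell ends up linked to all elements.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
HAS_MEMBER = "ex:hasMember"  # hypothetical helper predicate, not taken from the paper


def infer_list_membership(triples):
    """Apply two auxiliary rules until no new membership triple appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == RDF_FIRST:
                # Rule (a): a list cell is linked to its own element.
                new.add((s, HAS_MEMBER, o))
            elif p == RDF_REST:
                # Rule (b): a list cell inherits the members of the next cell.
                new |= {(s, HAS_MEMBER, e)
                        for s2, p2, e in facts
                        if s2 == o and p2 == HAS_MEMBER}
        if new <= facts:  # fixpoint reached
            return facts
        facts |= new


def members(facts, cell):
    """Return all elements linked to the given list cell."""
    return {o for s, p, o in facts if s == cell and p == HAS_MEMBER}
```

With these links in place, a rule such as #scm-uni only needs a single premise on the first list cell to reach every list element, instead of walking the list itself.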

E.g., using these rules, #scm-uni (L3) may be formulated as follows (Rule 2; note that Rule 3 similarly belongs to (L3)):

Multiple solutions are possible for n-ary rules from (L2). We chose a solution that replaces each (L2) rule by a set of auxiliary rules [16], which infer intermediary assertions for each list cell \( i \) \( (0 \le i < n) \), and, based on these inferences, finally generate the n-ary inference if the first cell is related to an (L2) assertion. We note that this is the only solution that does not require pre-processing the ruleset or ontology for each ontology update, compared to, e.g., instantiating (L2) rules based on n-ary assertions [19], or “binarizing” (i.e., converting all n-ary assertions to binary ones). For details on these solutions, we refer to the online documentation [20].

Based on these considerations, we created an OWL2 RL ruleset written in the SPARQL Inferencing Notation (SPIN) based on an initial ruleset created by Knublauch [15]. This initial ruleset did not specify axioms, and relied on built-in Apache Jena functions to implement n-ary rules. Our final ruleset contains 78 rules and 43 supporting axioms, and can be found online [20]. We checked the conformance of the OWL2 RL ruleset using the OWL2 RL conformance test suite by Schneider and Mainzer [21]. We note that some of these tests had to be left out, either due to the limitations of our OWL2 RL ruleset or difficulties testing conformance. We detail these cases online [20].

3 Using RETE for Reasoning on RDF

Production rule systems operate by matching production conditions to a set of assertions, and then adding/removing assertions based on the production’s actions [22]. For this purpose, the RETE algorithm sets up a network consisting of alpha nodes for each condition (i.e., intra-condition check), and beta nodes to join shared variables between these conditions (i.e., inter-condition check). Each alpha node, and all but the last beta node, is linked to a memory keeping the results of these checks. A rule ends with a terminal node, which represents the actions. To create a RETE network, the right input of each beta node is linked to an alpha node, and its left input to the previous beta node, or if none exists, the first alpha node (cfr. Ishida [23]). When reasoning over RDF data, an intra-condition check matches a triple pattern (or FILTER expression) to an RDF triple token [7, 24]. Below, we show the RETE network for the #cls-int2 OWL2 RL rule to be applied on an RDF dataset described using an OWL2 RL ontology; and describe the reasoning process when new facts (tokens) enter the network.

At time \( t_{1} \), triple tokens \( A_{2} \) and \( A_{3} \) enter the network, which are matched by alpha nodes \( \alpha_{2} \) and \( \alpha_{3} \) and inserted into their memories. At time \( t_{2} \), incoming token \( A_{1} \) is matched to alpha node \( \alpha_{1} \), and inserted into its memory. Since nodes \( \alpha_{1} \) and \( \alpha_{2} \) are connected by beta node \( \beta_{1} \), the new token triggers a join attempt between the left-input \( A_{1} \) token and the right-input \( A_{2} \) token. As shown in the token table, tokens \( A_{1} \) and \( A_{2} \) have the same value for shared variable ?x (i.e., <lst-A>), leading to a successfully joined token \( A_{12} \) that is added to the \( \beta_{1} \) memory. With this new token at its left input, node \( \beta_{2} \) attempts a join with right input token \( A_{3} \). Both tokens have the same value for shared variable ?c (i.e., <cls-A>), leading to a successfully joined token \( A_{123} \). This token reaches the terminal node, which will use the instantiated variables to infer a new fact, i.e., <inst-1> rdf:type <cls-C1>.
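The walk-through above can be reproduced with a small Python sketch of a linear RETE network. It is an illustration under stated assumptions, not the Apache Jena implementation: the class and function names are invented, tokens are plain dictionaries of variable bindings, and the three premise patterns used for #cls-int2 (including the helper predicate `ex:hasMember`) are assumed from the description of the shared variables ?x and ?c.

```python
def match(pattern, triple):
    """Intra-condition check: return variable bindings, or None if no match."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return bindings


def join(left, right):
    """Inter-condition check: merge two tokens if their shared variables agree."""
    if any(left[v] != right[v] for v in left.keys() & right.keys()):
        return None
    return {**left, **right}


class ReteNetwork:
    """Linear network: beta node k joins partial results 0..k-1 with alpha node k."""

    def __init__(self, premises, conclusion):
        self.premises = premises
        self.conclusion = conclusion
        self.alpha = [[] for _ in premises]  # one alpha memory per premise
        # left[k] keeps joined tokens over premises 0..k; left[0] is alpha memory 0.
        self.left = [self.alpha[0]] + [[] for _ in premises[1:]]
        self.inferred = []

    def add(self, triple):
        for i, pattern in enumerate(self.premises):
            token = match(pattern, triple)
            if token is None:
                continue
            self.alpha[i].append(token)
            if i == 0:
                self._push(0, [token])
            else:
                # The new token is the right input of the beta node at level i.
                joined = [join(l, token) for l in self.left[i - 1]]
                self._push(i, [j for j in joined if j is not None])

    def _push(self, level, tokens):
        last = len(self.premises) - 1
        while tokens:
            if level == last:
                # Terminal node: instantiate the conclusion with the bindings.
                for t in tokens:
                    self.inferred.append(tuple(t.get(x, x) for x in self.conclusion))
                return
            if level > 0:
                self.left[level].extend(tokens)  # store in the beta memory
            level += 1
            joined = [join(l, r) for l in tokens for r in self.alpha[level]]
            tokens = [j for j in joined if j is not None]


# Premises assumed for #cls-int2 (illustrative; ex:hasMember is hypothetical).
net = ReteNetwork(
    premises=[("?c", "owl:intersectionOf", "?x"),
              ("?x", "ex:hasMember", "?ci"),
              ("?y", "rdf:type", "?c")],
    conclusion=("?y", "rdf:type", "?ci"),
)
net.add(("lst-A", "ex:hasMember", "cls-C1"))       # A2
net.add(("inst-1", "rdf:type", "cls-A"))           # A3
net.add(("cls-A", "owl:intersectionOf", "lst-A"))  # A1 triggers both joins
```

In this sketch, inferred facts are only collected in `net.inferred`; a full rule system would feed them back into the network as new tokens.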

A standard RETE optimization is to re-use alpha nodes (and memories) when the same premise occurs multiple times, and beta nodes in case rules share the first two or more premises (else, the contents of the beta memories may differ). Re-using nodes and memories reduces the number of match and join operations, and avoids duplicate storage of tokens. To speed up the joining process, the most restrictive conditions (i.e., alpha nodes) are often placed first, and Cartesian products are avoided [7, 23]. Alpha and beta memories are typically indexed to allow for hashed joins [9, 25].
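As a rough illustration of hashed joins, the following Python sketch indexes a memory on the variables it shares with its join partner, so that a join attempt becomes a dictionary lookup rather than a scan. The class name `IndexedMemory` and the dictionary-based tokens are assumptions for illustration and do not reflect the data structures used in [9, 25].

```python
from collections import defaultdict


class IndexedMemory:
    """Memory whose tokens are hashed on the values of the join variables."""

    def __init__(self, key_vars):
        self.key_vars = tuple(key_vars)
        self.index = defaultdict(list)

    def _key(self, token):
        return tuple(token[v] for v in self.key_vars)

    def add(self, token):
        self.index[self._key(token)].append(token)

    def probe(self, token):
        """Return stored tokens that agree with `token` on all join variables."""
        return self.index.get(self._key(token), [])


# Alpha memory for a premise like <?y rdf:type ?c>, joined on shared variable ?c.
memory = IndexedMemory(["?c"])
memory.add({"?y": "inst-1", "?c": "cls-A"})
memory.add({"?y": "inst-2", "?c": "cls-B"})
candidates = memory.probe({"?c": "cls-A", "?x": "lst-A"})
```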

4 The RETEpool Algorithm

A default RETE optimization involves re-using alpha memories for identical premises (Sect. 3), which reduces data duplication. Nevertheless, we observe that generic rule premises (i.e., with more than one variable) typically still lead to large duplications of data in alpha memories. For instance, the memory related to premise <?x ?p ?y> will effectively duplicate all data from the memory of <?p rdf:type owl:ObjectProperty>; and both memories will overlap with the memory of <?y rdf:type ?c>. This is especially apparent in the OWL2 RL ruleset with its many generic premises. Wildcard premises, which are found in the OWL2 RL ruleset and match all tokens, are an extreme example: they effectively duplicate the data from all other alpha memories. Furthermore, we note that many Semantic Web applications involve an existing RDF store, which is used to load data into the rule system but serves other purposes as well (e.g., querying). Alpha memories will always duplicate (parts of) this RDF store, thus presenting a second, orthogonal level of data duplication.

We present the RETE pool algorithm, which pools alpha memories into a single shared memory. As a result, duplicate tokens, i.e., tokens occurring in multiple alpha memories, are only stored once in a single memory. In doing so, data duplication in alpha memories is effectively avoided. In scenarios with an existing RDF datastore, RETE pool can directly re-use this store as the shared memory; thus avoiding both internal duplication, as well as duplication between the RETE structure and RDF store. Below, we discuss the implementation of the RETE pool algorithm.

4.1 Implementation of RETEpool

The RETE pool algorithm utilizes virtual alpha memories (cfr. Hanson [26]) in the RETE network, which keep a mask on the single, shared memory that represents the related premise (e.g., <?c rdf:type ?t>). For instance, an RDF store may be used as the shared memory, as is done in our evaluation (Sect. 5).

New tokens are added to the shared memory, and injected at suitable alpha nodes into the network (see Fig. 1). In this process, a beta node will attempt to join the new token with other tokens from its input alpha memory. Joining two tokens implies they have the same value(s) for the shared variable(s). Hence, join operations can be performed by searching the alpha memory for tokens that match the value(s) of the shared variable(s). In our case, the virtual memory’s mask is extended with these values, and then used as a search constraint on the shared memory. Since the search uses the extended mask, only tokens that match the alpha node premise will be returned. This is illustrated in Fig. 2. At time \( t_{3} \), token \( A_{12} \) is used to extend the virtual memory mask with the token’s value of shared variable ?c, leading to search constraint S = ?, P = rdf:type, O = <cls-A>.

Fig. 1. Example RETE structure (rule #cls-int2).

Fig. 2. Usage of the shared memory in RETEpool.
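The idea of a virtual alpha memory as a mask on the shared memory can be sketched in Python as follows. This is a simplified illustration rather than the actual implementation: `SharedMemory` stands in for an RDF store and performs a linear scan where a real store would use its indexes, and all class and method names are assumptions. The virtual memory stores no tokens; a join attempt extends the premise mask with the bindings of the incoming token and searches the shared memory with it.

```python
class SharedMemory:
    """Stand-in for an RDF store (linear scan; a real store would use indexes)."""

    def __init__(self, triples=()):
        self.triples = set(triples)

    def add(self, triple):
        self.triples.add(triple)

    def find(self, s=None, p=None, o=None):
        """Return triples matching the mask; None acts as a wildcard."""
        mask = (s, p, o)
        return [t for t in self.triples
                if all(m is None or m == v for m, v in zip(mask, t))]


class VirtualAlphaMemory:
    """Keeps only a mask (the premise); tokens live in the shared memory."""

    def __init__(self, premise, shared):
        self.premise = premise
        self.shared = shared

    def search(self, bindings):
        # Extend the mask: constants stay, bound variables become constants,
        # unbound variables become wildcards (None).
        mask = tuple(bindings.get(term) if term.startswith("?") else term
                     for term in self.premise)
        return self.shared.find(*mask)


store = SharedMemory([
    ("inst-1", "rdf:type", "cls-A"),
    ("inst-2", "rdf:type", "cls-B"),
    ("cls-A", "owl:intersectionOf", "lst-A"),
])
alpha3 = VirtualAlphaMemory(("?y", "rdf:type", "?c"), store)
# Token A12 binds ?c to cls-A, giving the constraint S = ?, P = rdf:type, O = cls-A.
partners = alpha3.search({"?c": "cls-A", "?x": "lst-A"})
```

For brevity, the sketch does not handle premises in which the same variable occurs at two positions.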

Since each join involves accessing a (very) large shared memory, instead of a relatively small alpha memory, it is clear that this algorithm optimizes memory at the cost of performance. This is confirmed by our evaluation (Sect. 5). We note that the issue that RETE pool aims to solve, i.e., data duplication in alpha memories, clearly depends on premise selectivity. By only utilizing a virtual alpha memory for overly generic, non-restrictive premises, we may thus better balance memory usage with runtime performance. To that end, RETE pool allows configuring a selectivity threshold \( t_{s} (0 < t_{s} \le 1) \). In case the premise selectivity (i.e., number of matching facts) equals or exceeds \( t_{s} \), a virtual memory will be utilized, otherwise a regular memory. Our evaluation studies the effects of different values for \( t_{s} \) on memory and performance.
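The threshold-based choice between virtual and regular memories might be sketched as follows. The sketch assumes that premise selectivity is normalized as the fraction of all facts matched by the premise, which is one way to make it comparable to a threshold in the range (0, 1]; the function names and the string labels are likewise assumptions for illustration.

```python
def matches(premise, triple):
    """True if the triple matches the premise pattern (variables start with '?')."""
    bound = {}
    for term, value in zip(premise, triple):
        if term.startswith("?"):
            if bound.setdefault(term, value) != value:
                return False
        elif term != value:
            return False
    return True


def selectivity(premise, triples):
    """Assumed definition: fraction of all facts that match the premise."""
    triples = list(triples)
    if not triples:
        return 0.0
    return sum(matches(premise, t) for t in triples) / len(triples)


def choose_memory_kinds(premises, triples, t_s):
    """Use a virtual memory when selectivity equals or exceeds t_s."""
    if not 0 < t_s <= 1:
        raise ValueError("t_s must lie in (0, 1]")
    triples = list(triples)
    return {p: "virtual" if selectivity(p, triples) >= t_s else "regular"
            for p in premises}


data = [
    ("cls-A", "rdf:type", "owl:Class"),
    ("cls-B", "rdf:type", "owl:Class"),
    ("cls-A", "rdfs:subClassOf", "cls-B"),
    ("inst-1", "rdf:type", "cls-A"),
]
premises = [
    ("?s", "?p", "?o"),                # wildcard premise
    ("?y", "rdf:type", "?c"),
    ("?c", "rdf:type", "owl:Class"),
    ("?a", "rdfs:subClassOf", "?b"),
]
kinds = choose_memory_kinds(premises, data, t_s=0.5)
```

In this toy dataset, lowering the threshold moves more premises to virtual memories (less duplication, slower joins), while a threshold of 1 keeps only the wildcard premise virtual.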

Below, we discuss an additional issue that arises when an existing, pre-loaded RDF store is being re-used.

4.2 Reciprocal Join Issue

Many Semantic Web applications will start out with a pre-loaded RDF datastore, which will be used to inject data into the rule system. When utilizing RETE pool, an opportunity exists to re-use this datastore as a single shared memory. In this case, each virtual alpha memory will initially be fully “populated”, since it references the pre-loaded RDF dataset (Sect. 4.1). This is illustrated in Fig. 3.

Fig. 3. Example where the reciprocal-join issue occurs in RETEpool.

Rule 1. Two rules for inferring list membership.

Rule 2. Rule inferring subclasses based on union membership (#scm-uni).

Rule 3. Rule inferring resource types based on intersection members (#cls-int2).

At time \( t_{0} \), token \( A_{1} \) is injected into the network, and joins with token \( A_{2} \) from node \( \alpha_{2} \) (already present in its virtual memory). At time \( t_{1} \), token \( A_{2} \) is injected and similarly joins with token \( A_{1} \) (also present), thus performing a second, redundant (“reciprocal”) join. Later on, in case token \( A_{12} \) was stored twice in memory \( \beta_{1} \), four joins with token \( A_{3} \) would take place: one for each of the two \( A_{12} \) tokens at times \( t_{0} \) and \( t_{1} \); and when token \( A_{3} \) is injected at time \( t_{2} \), again once for each \( A_{12} \) token. As a result, a single successful rule firing requires a number of joins that is exponential in the number of alpha nodes in the network of the rule (for any rule r). In case duplicate tokens are not stored (e.g., duplicate checking takes place), a single rule firing would still require more joins than the minimum needed without reciprocal joins.

Since this issue only occurs during the first reasoning cycle, we introduced a custom reasoning process for this cycle. In this process, only tokens that match the first alpha node \( \alpha_{1} \) are injected. As such a token travels through the network, all possible joins will be attempted, since all tokens are already available in their virtual alpha memories, while reciprocal joins are avoided. In Fig. 3, by only injecting token \( A_{1} \), a second, redundant join with \( A_{2} \) will be avoided, which in turn avoids redundant joins later on in the network.
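The difference between the two injection strategies can be illustrated for the first beta node with a short Python sketch. It only covers the two-premise case and uses invented names; tokens are dictionaries of variable bindings, and both alpha memories are assumed to be fully populated from the pre-loaded store before the first cycle starts.

```python
def join(left, right):
    """Merge two tokens if their shared variables agree, else return None."""
    if any(left[v] != right[v] for v in left.keys() & right.keys()):
        return None
    return {**left, **right}


def first_cycle(alpha1, alpha2, inject_all):
    """Return the joined tokens produced at the first beta node.

    alpha1 and alpha2 hold the tokens already visible through the two
    virtual alpha memories.
    """
    results = []
    for left in alpha1:  # tokens injected at the first alpha node
        joined = [join(left, right) for right in alpha2]
        results += [j for j in joined if j is not None]
    if inject_all:
        for right in alpha2:  # injecting the second node's tokens as well
            joined = [join(left, right) for left in alpha1]
            results += [j for j in joined if j is not None]  # reciprocal joins
    return results


a1 = [{"?c": "cls-A", "?x": "lst-A"}]    # token A1
a2 = [{"?x": "lst-A", "?ci": "cls-C1"}]  # token A2
naive = first_cycle(a1, a2, inject_all=True)
custom = first_cycle(a1, a2, inject_all=False)
```

Here, `naive` contains the joined token twice, whereas `custom` contains it once; each duplicate would in turn trigger further redundant joins at later beta nodes.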

5 Semantic Reasoning Benchmarks

This section presents benchmark results for ontology-based reasoning, using a rule system and an OWL2 RL ruleset. For the rule system, we benchmarked multiple configurations of the RETE pool algorithm, and compared the results to a baseline RETE algorithm. To test the performance impact on resource-constrained platforms, we ran each benchmark on a PC as well as on a mobile device.

Below, we discuss the setup (Sect. 5.1) and present the benchmarks (Sect. 5.2).

5.1 Benchmark Setup

5.1.1 Baseline System

We extended the original Apache Jena RETE implementation [27] with standard optimizations, including the re-use of RETE nodes and memories, memory indexing and join ordering (see Sect. 3). These optimizations are considered standard practice in modern RETE systems [7, 25, 28], making the extended system an appropriate baseline. We also copied these extensions to Jena’s Android port, i.e., AndroJena [10]. In the benchmark results, this baseline system is referred to as RETE base. We implemented the RETE pool algorithm on top of this baseline implementation.

5.1.2 OWL2 RL Ontologies and Ruleset

Our benchmarks were executed on the BioPortal ontologies from the OWL 2 RL Benchmark Corpus [29]. On PC, reasoning over 3 of the 45 ontologies took longer than 10 min (our cut-off time) for any configuration, so these were left out. For the 42 remaining ontologies, the number of statements ranges from 246 to 57310 (avg. 6684), and their file sizes (N3 format) range from 24 KB to 5852 KB (avg. 642 KB). For the mobile benchmarks, we considered a subset of 34 BioPortal ontologies, since the other 11 ontologies either caused out-of-memory exceptions, or ran longer than 10 min. For these ontologies, the number of statements ranges from 246 to 7291 (avg. 2199) and their file sizes (N3 format) range from 24 KB to 838 KB (avg. 210 KB). As a result, we note that average performance times for PC and mobile are not directly comparable. We detail each set of ontologies in our online documentation [20].

As the benchmark ruleset, we utilized the OWL2 RL ruleset introduced in Sect. 2. This ruleset contains 78 rules and 43 supporting axioms, and can be found online [20]. As mentioned, we checked the conformance of this ruleset using the OWL2 RL conformance test suite by Schneider and Mainzer [21], and detail the results online [20].

5.1.3 Benchmark Platforms

Benchmarks were performed on two platforms:

(1) PC: Lenovo ThinkPad T530, with a dual-core Intel Core i7-3520M CPU (2.9 GHz), 8 GB RAM and a 64-bit architecture, running Windows 7 (Service Pack 1).

(2) Mobile: LG Nexus 5 (model LG-D820), with a 2.26 GHz quad-core processor and 2 GB RAM. This device runs Android 6, which grants apps 192 MB of heap space.

During the benchmarks, both devices were connected to a power supply.

5.1.4 RETEpool Configurations

We benchmarked the following algorithms and configurations (regular memories are utilized for beta nodes).

(A.i) RETE base: a regular memory is utilized for each alpha node.

(A.ii) RETEfull-pool: a virtual alpha memory is utilized for each alpha node.

(A.iii) RETEpart-pool: a virtual alpha memory is only utilized in case its premise selectivity equals or exceeds the configured threshold \( t_{s} \) (Sect. 4.1), with \( t_{s} \in \{0.1, 0.25, 0.5, 0.75, 1\} \).

Further, we consider two orthogonal scenarios for RETE pool:

(S.i) A shared memory is introduced for the sole purpose of supporting RETE pool;

(S.ii) An existing RDF store is re-used as the shared memory pool.

We utilized the Apache Jena RDF store as the shared memory. Each configuration was benchmarked in a one-shot reasoning scenario, i.e., with a single reasoning cycle over each of the benchmark ontologies. For RETEX-pool configurations, the shared RDF store was pre-loaded with the ontology, after which a custom reasoning process took place (see Sect. 4.2). This allowed us to estimate premise selectivity by using the number of matched tokens from the actual benchmark ontology.

5.1.5 Benchmark Metrics

We measure the following performance metrics:

(P.1) Network compilation time: time needed to compile the network, including selecting memory indices and deciding the best join order.

(P.2) Reading time: time needed for the system to read and parse the data.

(P.3) Reasoning time: time needed to complete the first reasoning cycle.

(P.4) Initialization time: time needed to load the ontology into the shared memory. As mentioned, for RETEX-pool configurations, the shared memory was pre-loaded with the ontology, after which reasoning took place (Sect. 5.1.4).

In addition, we collect the following memory-related metrics:

(M.1) Number of alpha memories: the total number of alpha memories, differentiating between different types (i.e., regular vs. virtual memories).

(M.2) Alpha memory size: the total size (KB) taken up by the set of alpha memories.

(M.3) Total memory size: the total size (KB) taken up by the set of alpha memories and the shared memory pool.

(M.4) RDF nodes size: the total size (KB) taken up by RDF node data. The contents of an RDF graph are kept in RDF node objects. By considering this memory size separately, these are not accidentally counted towards only (M.2) or (M.3).

(M.5) Shared memory size: the total size (KB) taken up by the shared memory, if any.

For evaluating the (S.ii) shared memory pool setup, we also measure the following:

(M.6) Memory operations: the number and performance overhead of memory operations, including updating the shared memory and individual RETE memories.

To obtain actual memory usage (KB), we performed heap dumps on PC and Android. Per configuration, a heap dump was taken after reasoning over the ontology that yielded the min., median and max. number of alpha memory tokens, respectively.

5.2 Benchmark Results

Table 1 shows the memory usage for RETE base (i.e., the baseline system) and the RETE pool configurations. In particular, it shows the memory usage for the min., median and max. ontology (see Sect. 5.1.5). The table lists the number of alpha memories (M.1), alpha memory sizes (M.2), and (for ease of reference) the total memory size, which includes alpha memories and the shared memory (if any) (M.3). Further, the table shows the reasoning performances (P.3).

Table 1. Memory usage (KB) and reasoning performance (ms). *: RETEpart-pool \( (t_{s}) \); †: r = regular, v = virtual; **: median (min – max); ***: average (min – max).

The following memory metrics are identical for all configurations (KB): (M.4) RDF nodes size: median: 800 (min: 203 – max: 26760); (M.5) Shared memory size: median: 562 (min: 98 – max: 26070). Further, the following performance metrics (ms) are identical: (P.1) Network compilation time: avg. ca. 5 (PC), 71 (mobile); (P.2) Reading time: avg. ca. 73 (PC), 696 (mobile). As mentioned, a separate data loading step took place for RETEX-pool configurations (P.4). This amounts to 11 ms for the median ontology (min: 3 ms – max: 670 ms) on PC, and 99 ms for the median ontology (min: 15 ms – max: 1034 ms) on mobile.

Note that RETEpart-pool configurations with \( t_{s} = 0.1 \) and \( t_{s} = 0.25 \); and \( t_{s} = 0.75 \) and \( t_{s} = 1 \) are pairwise identical, so we only present \( t_{s} = 0.1 \) and \( t_{s} = 1 \).

We first observe that only 46 alpha memories are required for a total of 78 OWL2 RL rules, due to the re-use of alpha (and beta) nodes and memories where possible (Sect. 3). Memory savings for RETE pool depend on the concrete application scenario: i.e., whether (S.i) a separate shared memory needs to be introduced, or (S.ii) an existing, multi-purpose RDF store can be re-used for this purpose.

Scenario (S.i)

Even in scenario (S.i), we expect significant memory savings since duplication of data in alpha memories is avoided either entirely (RETEfull-pool) or partially (RETEpart-pool). Indeed, memory savings for RETEfull-pool are large, at ca. 60% for the median ontology (min: 55%, max: 50%). As expected, the number of regular alpha memories (M.1) increases together with threshold \( t_{s} \). This causes data duplication among individual memories, as well as with the shared memory, thus increasing the total memory usage (M.3). For RETEpart-pool (0.1), memory savings drop to ca. 30% for the median ontology (min: 25%, max: 25%); for RETEpart-pool (0.5), to ca. 4% (min: 1%, max: 23%); and for RETEpart-pool (1), memory usage is similar to RETE base for median and min., and increases for max.

At the same time, we observe that RETEfull-pool greatly reduces performance, by factors of avg. ca. 3.3 and 2.8 for PC and mobile, respectively. We expect RETEpart-pool to strike a better memory/performance balance, with performance improving as threshold \( t_{s} \) increases. Indeed, performance improves greatly on mobile (e.g., \( t_{s} = 0.1 \): avg. ca. 58%), and approaches RETE base as \( t_{s} \) increases. On PC, we observe the same effect (e.g., \( t_{s} = 0.1 \): avg. ca. 69%) at least until \( t_{s} \ge 0.5 \), after which performance (slightly) worsens. This is likely due to excessive garbage collection; when leaving out the 7 most memory-intensive ontologies, RETEpart-pool performance is constant for all \( t_{s} \) at avg. ca. 1.25 s (RETE base = 1.15 s, RETEfull-pool = 3.3 s).

We conclude that RETEpart-pool (0.1) achieves the best memory/performance balance. Compared to RETE base, it saves ca. 30% of memory for the median ontology (min: 25%, max: 25%), whereas it is ca. 4.3 s slower on mobile and 0.6 s slower on PC. While RETEpart-pool (0.5) only incurs a penalty of ca. 1.5 s on mobile and 0.8 s on PC, its memory savings are significantly lower, i.e., 4% for the median ontology (min: 1%, max: 23%). At the same time, we note that an extra initialization time is incurred for all RETEX-pool configurations (P.4).

Scenario (S.ii)

In scenario (S.ii), since we are re-using an RDF store that is utilized for other purposes as well, the size of the shared memory is not counted towards our total memory usage. In that case, memory savings by RETEfull-pool are tremendous: it uses only ca. 0.04% (max) to ca. 7.6% (min) of the memory used by RETE base. In fact, RETEpart-pool (1), which utilizes the most memory, still only takes up avg. ca. 57% of RETE base.
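To relate these relative usage figures to the savings percentages reported elsewhere in this section, and assuming savings are defined as one minus the ratio of memory used (with \( M_{\text{pool}} \) and \( M_{\text{base}} \) as illustrative symbols for the memory usage of a RETEX-pool configuration and of RETE base), the RETEfull-pool figures correspond to:

\[ \text{savings} = 1 - \frac{M_{\text{pool}}}{M_{\text{base}}}, \qquad 1 - 0.0004 = 99.96\% \ (\text{max}), \qquad 1 - 0.076 = 92.4\% \ (\text{min}). \]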

For RETEpart-pool (0.1), memory savings amount to 68% for the median ontology (min: 62%, max: 74%); for RETEpart-pool (0.5), to 42% (min: 39%, max: 72%); and for RETEpart-pool (1), to 40% (min: 37%, max: 43%).

In this case, we conclude that RETEpart-pool (0.5) is preferable: it greatly improves performance (ca. 65%) on mobile (PC is only slightly slower), while, in this scenario, memory savings are significant as well. Further, the performance gains by RETEpart-pool (1) do not seem comparable to its increased memory usage. As before, we note that any RETEX-pool configuration also incurs an extra initialization time (P.4).

6 Related Work

To realize ontology-based reasoning, many mobile reasoners, i.e., targeting resource-constrained platforms, utilize rule-based OWL axiomatizations; such as custom entailment rulesets [30, 31] or OWL2 RL rulesets [7, 13]. For instance, MiRE4OWL [32] and μOR [31] apply a custom entailment ruleset; Seitz et al. [17] load the CLIPS engine with the OWL2 RL ruleset; and Tai et al. [7] and BaseVISor [33] rely on rules implementing pD* semantics. In general, by focusing on subsets of rule axioms, rule-based axiomatizations allow easily adjusting reasoning complexity to the application scenario [7], or avoiding resource-heavy inferences [16, 17]. In contrast, transformation rules used in tableau-based DL reasoning are often hardcoded, making it hard to de-select them at runtime [7]. Also, most classic DL optimizations improve performance at the cost of memory, which is limited in mobile devices [8].

To deal with data duplication caused by generic rule premises in OWL2 RL, we presented the RETE pool algorithm, which utilizes virtual alpha memories that act as masks (or views) on a large, shared dataset. This concept was first introduced by Hanson for the Ariel system [26]. Later on, Hanson et al. [34] presented a set of optimizers that choose an efficient Gator network (see below), possibly including virtual alpha memories, based on database size, predicate selectivity and update frequency distribution, among others. In this paper, we implemented this concept to realize more memory-efficient, semantic ontology-based reasoning on mobile platforms. We further consider typical Semantic Web scenarios, where an RDF store is already available and possibly pre-loaded with data, as well as the issues ensuing from such a setup. As opposed to Hanson et al., our evaluation focuses in particular on how memory and performance may be balanced by using different selectivity thresholds \( t_{s} \).

Some approaches [13, 14] support a different solution for dealing with generic rule premises, as they occur in rule-based axiomatizations such as OWL2 RL. In these solutions, a first step materializes all schema-related inferences in the ontology (e.g., using a separate OWL reasoner), which is then followed by a rule instantiation step. Based on the materialized schema, the second step creates multiple concrete rules for each generic rule by replacing schema variables by concrete schema references. This kind of approach deals poorly with ontology schema updates, and is thus only suitable for scenarios where such updates do not occur (or occur very infrequently). As a different solution to optimizing mobile, semantic reasoning, Tai et al. [7] present a selective rule loading algorithm, which composes a pD* ruleset based on ontology expressivity; and a two-phase RETE construction process, which utilizes selectivity information from the first phase to optimize join sequences in the second phase. Komazec and Cerri [35] integrated a special \( \varepsilon \) network into RETE to optimize RDFS entailments.

We note that other production rule algorithms aside from RETE exist. Miranker [22] proposed the TREAT algorithm, which, instead of storing join results, re-calculates results of intermediate joins when required. In doing so, TREAT avoids the memory and maintenance overhead of beta memories. The Gator [36] and RETE* [37] algorithms generalize RETE and TREAT, treating both as special cases. The more recent PHREAK algorithm, introduced by the well-known Drools production system [25], is based on RETE but incorporates lazy and goal-oriented aspects. As our work focuses on reducing data duplication in alpha memories, which is an issue that, to the best of our knowledge, potentially affects all these approaches and algorithms, it can be considered complementary to these efforts.

7 Conclusions and Future Work

In this paper, we presented the RETE pool algorithm which, by pooling a particular selection of RETE alpha memories, aims to balance memory usage with performance. We illustrated how this algorithm is well-suited for typical Semantic Web scenarios, which often utilize an existing, multi-purpose RDF store. We performed an extensive set of benchmarks, which evaluated semantic, ontology-based reasoning using our OWL2 RL ruleset and multiple configurations of the algorithm, both on PC and mobile platforms. In line with expectations, the RETE pool algorithm drastically reduces memory usage. By configuring selectivity thresholds, i.e., where virtual alpha memories are only used in case estimated selectivity equals or exceeds a threshold, we were better able to balance memory savings with performance overhead.

Our evaluation has a number of limitations. Firstly, our solution and evaluation focus specifically on semantic reasoning using the OWL2 RL ruleset, which includes many generic rule premises. For other rulesets with more concrete premises, utilizing RETE pool will likely lead to smaller memory savings. Hence, future work includes running additional benchmarks to test the usefulness of this approach for other types of rulesets. Secondly, premise selectivity was estimated based on the actual number of tokens matched from the benchmark ontology. Clearly, this will not be possible in incremental reasoning scenarios, where only a very limited amount of initial data is available. As a result, future work involves utilizing other kinds of selectivity estimates (e.g., based on SPO position). Thirdly, our goal was to establish to what extent the proposed optimization reduces memory usage and impacts performance, which can only be done by comparisons with the baseline system. When we arrive at a more mature, fully-fledged rule system, future work will involve comparisons to other rule systems. Finally, to avoid the main pitfall of RETE pool (i.e., accessing a large shared memory for each join attempt), future work involves creating a more fine-grained memory strategy. We observe that alpha memories will often completely subsume other memories, depending on premise structure: e.g., premise <?c rdf:type ?t> subsumes premise <?c rdf:type owl:Class>. By constructing a nested memory structure, a subsuming memory could directly access the data of subsumed memories, while still reducing duplication.