1 Introduction

Augmented Reality (AR) [1] systems have become an accepted part of everyday life due to the proliferation of smartphones. Such a system extends the physical environment of a user with virtual elements. An Augmented Reality system is regarded as a representative of context-aware computing [2], in which the system takes the continuously changing context and environment into account. As a result, the user is able to obtain and visualize context-dependent information.

At present, the behavior of state-of-the-art Augmented Reality systems is described only by architectural diagrams. This raises the following question: how can we formally model a context-aware semantic Augmented Reality system? A formal description is needed that models context-aware AR systems with mathematical precision and that can also express the features of such a system.

In this paper, a new formal model for context-aware semantic Augmented Reality systems is presented. The model consists of two parts. The first part formalizes the components of the AR system using set theory functions. The second part describes the behavior of the system with an integrated space-time-motion logic, which makes it possible to execute logical inferences. Our model contributes to the development of the theoretical foundations of Augmented Reality systems. For validation purposes, two context-aware mobile Augmented Reality browsers have been implemented based on our abstract formal model.

The structure of the paper is as follows. After the introductory Sect. 1, related work is presented in Sect. 2. The proposed formal model is described in Sect. 3. Then, use cases where the formal model was applied are introduced in Sect. 4. Finally, the conclusion and future work are given in Sect. 5.

2 Related Work

In the last decades, several definitions of Augmented Reality have been developed. One of the most popular definitions is given by Azuma [1], who described Augmented Reality informally using three characteristics. Milgram and Kishino [3] defined a taxonomy of mixed reality and introduced the concept of the virtuality continuum, which positions Augmented Reality between the real environment and the virtual environment. Despite their popularity, these definitions are still informal explanations, and a formal description is needed in order to ensure mathematical precision.

Reicher introduced a framework for AR systems in his doctoral thesis [4]. A reference architecture and design patterns for Augmented Reality have also been described in that work. The proposed method provides a detailed description of a wearable Augmented Reality system architecture and can be used as a guideline for AR system design. However, the solution does not contain any formal model or description.

Galton developed a temporal logic in [5] which is able to describe time-dependent and spatial phenomena. The work is based on the spatial logic introduced by Randell et al. [6]. Owing to its capability of describing continuous movement, this framework can be used for modeling the behavior of an Augmented Reality system. In addition, reasoning about the motion of a user of the AR system also becomes possible by means of the logic. The details can be seen in Sect. 3.

3 CAESAR Model

The formal description of the proposed system is introduced in this section. The model is called CAESAR (Context-Awareness Enriched Semantic Augmented Reality). It consists of a part based on set theory functions that formalizes the components of an AR system, and an integrated time-space-motion logic that describes the behavior of the system.

3.1 Formal Description with Set Theory Functions

Each component of the CAESAR model is formally described in this subsection using set theory functions. The formalization of the features provided by the CAESAR model is also given here. The model includes the following four components:

  • data: provides the user-generated and the integrated POI data from different data sources that are displayed by the AR browser,

  • browser: is responsible for the visualization of data (provided by data component),

  • semantic: is responsible for the connection to semantic web,

  • context: adaptive segment that recommends content based on the context.

Formally, a quartet \( CAESAR: = \left\langle {Data,ARB,SWB,C} \right\rangle \) is called a context-aware semantic Augmented Reality system, where \( Data \), \( ARB \), \( SWB \), and \( C \) are the data, browser, semantic and context components, respectively.

Data Component. The first building block of the model is the data component, which is responsible for providing the displayable data. The data can be derived from different sources that use different storage schemas. Therefore, heterogeneous data integration is needed, which includes global schema matching and entity resolution.

Definition 1.

The triple \( \left\langle {{\mathcal{G}},{\mathcal{S}},{\mathcal{M}}} \right\rangle \) is called a data integration system, where \( {\mathcal{G}} \) is the global schema, \( {\mathcal{S}} \) is the set of source elements, and \( {\mathcal{M}} \) is the mapping between the global schema and the schemas of the heterogeneous data.

A query \( q \) is posed against the integrated data source using the global schema \( {\mathcal{G}} \). During this process, \( {\mathcal{M}} \) is responsible for mapping \( q \) to \( {\mathcal{S}} \) [7].

Schema matching can be used for determining global schema \( {\mathcal{G}} \) based on \( {\mathcal{S}} \). The core element of schema matching is operator \( match_{SM} \). The definition of \( {\mathcal{M}}_{SM} \) which is the mapping between two schemas is needed before introducing \( match_{SM} \).

Definition 2.

The mapping \( {\mathcal{M}}_{SM} \) is the set of mapping elements between schema \( S_{1} \) and \( S_{2} \) . The mapping elements represent that certain elements of \( S_{1} \) have been mapped into certain elements of \( S_{2} \).

Definition 3.

The operator \( match_{SM} \) is a function \( f:{\mathcal{S}} \times {\mathcal{S}} \to {\mathcal{M}}_{SM} \) which creates the mapping \( {\mathcal{M}}_{SM} \) from two given schemas. The resulting outcome is called the match result [8].
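To make Definitions 2 and 3 concrete, the following Python sketch shows one possible \( match_{SM} \) operator. It is only an illustration under stated assumptions: a schema is modeled as a set of attribute names, a mapping element is a pair of attributes, and two attributes are mapped when their name similarity reaches a threshold. The name `match_sm`, the `threshold` parameter and the use of a string-similarity heuristic are assumptions of this sketch and are not part of the model.

```python
from difflib import SequenceMatcher


def match_sm(schema_1, schema_2, threshold=0.8):
    """Return a mapping M_SM as a set of (attribute of S1, attribute of S2) pairs.

    Assumption: attributes are plain strings and similarity is purely lexical.
    """
    mapping = set()
    for a1 in schema_1:
        for a2 in schema_2:
            # ratio() is 1.0 for identical strings and 0.0 for no common characters
            score = SequenceMatcher(None, a1.lower(), a2.lower()).ratio()
            if score >= threshold:
                mapping.add((a1, a2))
    return mapping


source_a = {"name", "lat", "lon"}
source_b = {"Name", "latitude", "longitude"}
match_result = match_sm(source_a, source_b)
```

With the default threshold only the attribute pair that differs in capitalization is mapped; a real matcher would typically combine lexical, structural and instance-level evidence.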

The identification of duplicated elements is crucial during the data integration process, and entity resolution can be used for this purpose. Function \( match_{ER} \) (which is different from the above-mentioned \( match_{SM} \)) can be used to identify elements that are the same (in our case, the POIs which represent the same real-world entity).

Definition 4.

Let \( E \) be the set of entities. Then, \( match_{ER} \) is a Boolean function \( f:E \times E \to \{ true,false\} \) which determines whether two entities match (i.e. represent the same real-world entity) or not. A match is denoted by \( e_{1} \approx e_{2} \), which holds if \( match_{ER} \left( {e_{1} ,e_{2} } \right) = true \), where \( e_{1} ,e_{2} \in E \).

A partial order on the entities can be defined using the entity-related useful information. If entity \( e_{2} \) holds more information than \( e_{1} \), then \( e_{2} \) dominates \( e_{1} \) (denoted by \( e_{1} { \preccurlyeq }e_{2} \)).

Definition 5.

The function \( \mu :E \times E \to E \) (called merge) merges two matching entities \( e_{1} \approx e_{2} \) into one entity. During merging, the function keeps only the dominant entity \( e_{2} \) and extends it with the missing attributes deriving from \( e_{1} \).

Definition 6.

An instance \( I = \left\{ {e_{1} , \cdots ,e_{n} } \right\} \) is a finite set of entities from \( E \).

The merge closure finds all matching entities within instance \( I \) and merges them using the match and merge functions.

Definition 7.

Let \( I \) be an instance. Then the merge closure of \( I \) (denoted by \( \bar{I} \)) is the smallest set \( S \) such that \( I \subseteq S \) and, for all \( e_{1} ,e_{2} \in S \), if \( e_{1} \approx e_{2} \), then \( \mu (e_{1} ,e_{2} ) \in S \).

The domination of entities can naturally be extended to the instances as well.

Definition 8.

Let \( I_{1} ,I_{2} \) be two instances. Then, \( I_{1} \) is dominated by \( I_{2} \) (denoted by \( I_{1} { \preccurlyeq }I_{2} \) ), if \( \forall e_{1} \in I_{1} ,\exists e_{2} \in I_{2} \) , such that \( e_{1} { \preccurlyeq }e_{2} \).

Entity resolution can be defined using the above auxiliary definitions.

Definition 9.

Let \( I \) be an instance and \( \bar{I} \) be the merge closure of \( I \). An entity resolution of \( I \) is a set of entities \( I' \) such that \( I' \subseteq \bar{I} \) and \( \bar{I}{ \preccurlyeq }I' \). In addition, no proper subset of \( I' \) satisfies these two conditions [9].
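Definitions 4 to 9 can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not part of the model: an entity is a frozen set of (attribute, value) pairs, two POIs match when their `name` attributes are equal, \( e_{1} { \preccurlyeq }e_{2} \) means that every attribute of \( e_{1} \) also appears in \( e_{2} \), and matching entities do not carry conflicting values. All function names are hypothetical.

```python
from itertools import combinations


def entity(**attrs):
    """Build a hashable entity from keyword attributes."""
    return frozenset(attrs.items())


def match_er(e1, e2):
    # Assumption: two POIs represent the same real-world entity if names are equal.
    n1, n2 = dict(e1).get("name"), dict(e2).get("name")
    return n1 is not None and n1 == n2


def merge(e1, e2):
    """Merge function: keep the larger entity and add the attributes it lacks."""
    big, small = (e2, e1) if len(e2) >= len(e1) else (e1, e2)
    merged = dict(small)
    merged.update(dict(big))  # values of the dominant entity win
    return frozenset(merged.items())


def merge_closure(instance):
    """Smallest superset of the instance closed under merging matching entities."""
    closure = set(instance)
    changed = True
    while changed:
        changed = False
        for e1, e2 in combinations(list(closure), 2):
            if match_er(e1, e2):
                merged = merge(e1, e2)
                if merged not in closure:
                    closure.add(merged)
                    changed = True
    return closure


def entity_resolution(instance):
    """Keep only the entities of the closure not strictly dominated by another."""
    closure = merge_closure(instance)
    return {e for e in closure if not any(e < other for other in closure)}


a = entity(name="cafe x", lat=47.5)
b = entity(name="cafe x", phone="123")
c = entity(name="museum y", lat=47.6)
resolved = entity_resolution({a, b, c})
```

In this example the two records of the same cafe are merged into a single entity carrying both the coordinate and the phone number, while the museum is kept unchanged.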

In conclusion, component \( Data \) is a data integration system \( \left\langle {{\mathcal{G}},{\mathcal{S}},{\mathcal{M}}} \right\rangle \) which creates global schema \( {\mathcal{G}} \) from source schemas \( {\mathcal{S}} \) using operator \( match_{SM} \). Furthermore, the execution of entity resolution during the data integration phase is also the responsibility of component \( Data \).

Browser Component. The second component of the CAESAR model is the browser component, which is responsible for displaying the data provided by the aforementioned data component. The visualization uses Augmented Reality, which can be formally defined in the following way.

Definition 10.

A quintet \( \left\langle {{\mathcal{M}},{\mathcal{V}\mathcal{E}},{\mathcal{T}},\varphi ,\xi } \right\rangle \) is called an Augmented Reality system, where \( {\mathcal{M}} \) is the set of markers, \( {\mathcal{V}\mathcal{E}} \) is the set of virtual elements, \( {\mathcal{T}} \) is the set of transformations, \( \varphi \) is the mapping function, and \( \xi \) is the transformation function.

Let \( IB, PB \) (image-based markers and position-based markers) be two disjoint sets. Then, \( {\mathcal{M}} \) can be written as follows:

$$ {\mathcal{M}} = IB \cup PB . $$
(1)

Let \( I, V, S \) and \( K \) (images, videos, sounds, knowledge base, respectively) be pairwise disjoint sets. Then the set of virtual elements \( {\mathcal{V}\mathcal{E}} \) can be written in the next form:

$$ {\mathcal{V}\mathcal{E}} = I \cup V \cup S \cup K. $$
(2)

The set \( {\mathcal{T}} \) contains geometric transformations, namely translation (\( \tau \)), rotation (\( \rho \)) and scale (\( \sigma \)). In addition, let \( L \) be the set of 3D vectors. Every virtual element \( v \in {\mathcal{V}\mathcal{E}} \) has an \( l \in L \) vector. The vector \( l \) stores the position of virtual element \( v \).

Function \( \varphi :{\mathcal{M}} \to {\mathcal{V}\mathcal{E}} \times L \) assigns a virtual element and its relative initial position to a marker. The range of function \( \varphi \) contains the empty set (i.e. the case when no virtual element is assigned to a given marker).

The last part of the quintet is the transformation function \( \xi \). Function \( \xi :{\mathcal{M}} \times {\mathcal{V}\mathcal{E}} \times L \times { \mathcal{T}} \to {\mathcal{V}\mathcal{E}} \times L \) transforms a virtual element corresponding to the given marker with a given transformation in real-time.

Current Augmented Reality systems can be modeled by the above-mentioned definitions. The browser component of the CAESAR model is an AR system \( \langle {\mathcal{M}}|_{PB} ,{\mathcal{V}\mathcal{E}}|_{K} ,{\mathcal{T}},\varphi ,\xi \rangle \) where set \( {\mathcal{M}} \) is restricted to the position-based markers \( PB \) and set \( {\mathcal{V}\mathcal{E}} \) is restricted to the knowledge base \( K \).
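The mapping function \( \varphi \) and the transformation function \( \xi \) can be illustrated with the following Python sketch. It rests on assumptions that go beyond the definitions above: markers and virtual elements are plain strings, positions are 3D tuples, a transformation is a tagged pair, rotation is taken about the vertical axis only, and scaling acts uniformly on the position vector. The marker and element identifiers are hypothetical.

```python
import math


def xi(marker, element, position, transformation):
    """Apply one transformation from T to the position of a virtual element.

    `marker` is kept only to mirror the signature M x VE x L x T -> VE x L.
    """
    kind, arg = transformation
    x, y, z = position
    if kind == "translate":  # tau: shift by a 3D offset
        dx, dy, dz = arg
        return element, (x + dx, y + dy, z + dz)
    if kind == "rotate_z":  # rho: rotate about the vertical axis, angle in radians
        c, s = math.cos(arg), math.sin(arg)
        return element, (c * x - s * y, s * x + c * y, z)
    if kind == "scale":  # sigma: uniform scaling of the position vector
        return element, (arg * x, arg * y, arg * z)
    raise ValueError(f"unknown transformation: {kind}")


# phi: marker -> (virtual element, relative initial position);
# None models a marker without an assigned virtual element.
phi = {
    "poi-marker-1": ("knowledge-base-entry-1", (1.0, 2.0, 3.0)),
    "poi-marker-2": None,
}

element, start = phi["poi-marker-1"]
moved = xi("poi-marker-1", element, start, ("translate", (1.0, 0.0, 0.0)))
```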

Semantic Component. The third component of the model is based on semantic web technologies and is responsible for interconnecting the information provided by the data component with publicly available semantic data sources. Since semantic data sources are interlinked using IRIs (Internationalized Resource Identifiers), the semantic data source can be explored starting from a concept derived from the data component.

Let \( B,I, \) and \( L \) be the pairwise disjoint sets of blank nodes, IRIs and literals, respectively. In addition, let \( BIL \), \( BI \), and \( IL \) be the abbreviations of \( B \cup I \cup L \), \( B \cup I \), and \( I \cup L \), respectively. An element of \( BIL \) is referred to as an RDF term.

Definition 11.

A triplet \( (s,p,o) \in BI \times I \times BIL \) is called an RDF triple, where predicate \( p \) connects subject \( s \) with object \( o \). Let the finite set of RDF triples be denoted by \( RDF_{3} \).

Definition 12.

Let the infinite set of RDF triples be denoted by \( RDF_{DB} \) and be called an RDF database (or RDF document).

The data storage model of the semantic web can be described by Definitions 11 and 12. However, a query language is needed in order to access and manipulate the data stored in this way. One of the semantic technologies, the SPARQL query language, was designed for this purpose. Since RDF triples can be considered as directed edges, the RDF database can be seen as a directed graph. As a result, the SPARQL language reduces searching the RDF database to graph pattern matching [10].

An ontology is one of the key components of the semantic web; it describes the relations, rules and restrictions among the concepts.

Definition 13.

An ontology is a structure \( {\mathcal{O} := }(C, \le_{C} ,P,\sigma ) \), where \( C \) and \( P \) are the disjoint sets of classes and properties, respectively. A partial order \( \le_{C} \) on \( C \) is called the class hierarchy, while the function \( \sigma :P \to C \times C \) describes the signature of the properties. Let \( c_{1} ,c_{2} \in C \) be two classes. If \( c_{1} \le_{C} c_{2} \), then \( c_{1} \) is a subclass of \( c_{2} \) and \( c_{2} \) is a superclass of \( c_{1} \) [11].
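As an illustration of Definition 13, the following Python sketch stores a small ontology and decides \( c_{1} \le_{C} c_{2} \) by following direct superclass links. The class and property names are invented for the example, and representing the class hierarchy by its direct edges (with the partial order obtained as their reflexive-transitive closure) is an assumption of this sketch.

```python
# Direct superclass edges; the class hierarchy is their reflexive-transitive closure.
direct_superclasses = {
    "University": ["EducationalInstitution"],
    "EducationalInstitution": ["POI"],
    "Restaurant": ["POI"],
}

# sigma: property -> (domain class, range class)
sigma = {
    "locatedIn": ("POI", "City"),
}


def is_subclass(c1, c2, edges):
    """Return True if c1 <=_C c2, i.e. c2 is reachable from c1 via superclass edges."""
    stack, seen = [c1], set()
    while stack:
        current = stack.pop()
        if current == c2:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(edges.get(current, ()))
    return False


university_is_poi = is_subclass("University", "POI", direct_superclasses)
```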

The following definitions will be used in the description of the semantic component of CAESAR model.

Definition 14.

Let \( link:RDF_{3} \times RDF_{3} \to \{ true,false\} \) be a Boolean function, which decides whether two RDF triples are directly accessible from each other or not. Let \( { \rightsquigarrow } \) denote this function.

The behavior of function \( link \) can formally be described in the following way:

$$ \left( {s_{1} ,p_{1} ,o_{1} } \right){ \rightsquigarrow }\left( {s_{2} ,p_{2} ,o_{2} } \right) = \begin{cases} true, & \text{if } s_{1} \ne s_{2} \wedge \left( {o_{1} = s_{2} \vee o_{2} = s_{1} \vee \left( {o_{1} = o_{2} \wedge o_{1} \notin L} \right)} \right) \\ false, & \text{otherwise} \end{cases}, $$
(3)

where \( \left( {s_{1} ,p_{1} ,o_{1} } \right),\left( {s_{2} ,p_{2} ,o_{2} } \right) \in {\text{RDF}}_{3} \).
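Eq. 3 translates almost literally into code. The Python sketch below assumes that \( L \) denotes the set of literals (as the triple definition \( BI \times I \times BIL \) implies), that an RDF triple is a tuple of three strings, and that a literal is written with a leading double quote as in N-Triples; the string convention and the example identifiers are assumptions of this sketch.

```python
def is_literal(term):
    # Assumption: literals are quoted strings, e.g. '"Danube"'.
    return term.startswith('"')


def link(t1, t2):
    """Return True if the two RDF triples are directly accessible (Eq. 3)."""
    s1, _p1, o1 = t1
    s2, _p2, o2 = t2
    return s1 != s2 and (
        o1 == s2
        or o2 == s1
        or (o1 == o2 and not is_literal(o1))
    )


t1 = ("ex:bme", "ex:locatedIn", "ex:budapest")
t2 = ("ex:budapest", "ex:country", "ex:hungary")
t3 = ("ex:a", "rdfs:label", '"Danube"')
t4 = ("ex:b", "rdfs:label", '"Danube"')
t5 = ("ex:a", "ex:near", "ex:danube")
t6 = ("ex:b", "ex:near", "ex:danube")
```

Here `t1` and `t2` are linked because the object of one is the subject of the other, `t5` and `t6` are linked through a shared non-literal object, while `t3` and `t4` only share a literal and are therefore not linked.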

The directly accessible (one step relation) RDF triples can be obtained using function \( link \). The indirectly accessible RDF triples can be determined using a method which is similar to the derivation method of formal languages theory.

Definition 15.

Let \( link_{indirect*} :RDF_{3} \times \ldots \times RDF_{3} \to \{ true,false\} \) be a Boolean function which determines whether two RDF triples are indirectly accessible from each other or not. Let \( { \rightsquigarrow }_{*} \) denote this function.

A triple \( b \in RDF_{3} \) is indirectly accessible (or derivable) from a triple \( a \in RDF_{3} \) if the following condition holds:

$$ \exists n \in {\mathbb{N}},\text{ }r_{1} , \cdots ,r_{n} \in {\text{RDF}}_{3} :a = r_{1} \wedge b = r_{n} \wedge \left( {\forall i \in \left[ {1 \cdots n - 1} \right]:r_{i} { \rightsquigarrow }r_{i + 1} } \right). $$
(4)

The component \( SWB \) is a semantic database which satisfies the following criteria:

$$ \exists d \in Data,\forall r \in RDF_{DB} :d { \rightsquigarrow }_{ *} r. $$
(5)

In detail, at least one element of the information provided by the data component can be used to explore the full semantically represented dataset.
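The derivation in Eq. 4 and the criterion in Eq. 5 can be checked on a finite set of triples with a breadth-first search. The sketch below makes the same assumptions as Eq. 3 read with \( L \) as the set of literals: triples are tuples of strings and literals start with a double quote. The function names and the sample data are hypothetical.

```python
from collections import deque


def link(t1, t2):
    """Direct accessibility of two triples (Eq. 3); literals start with '"'."""
    s1, _p1, o1 = t1
    s2, _p2, o2 = t2
    shared_object = o1 == o2 and not o1.startswith('"')
    return s1 != s2 and (o1 == s2 or o2 == s1 or shared_object)


def reachable(start, triples):
    """Return every triple r with start ~>* r (Eq. 4), including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for candidate in triples:
            if candidate not in seen and link(current, candidate):
                seen.add(candidate)
                queue.append(candidate)
    return seen


def satisfies_eq5(data_triples, rdf_db):
    """True if some data triple reaches every triple of the database (Eq. 5)."""
    database = set(rdf_db)
    return any(database <= reachable(d, database | {d}) for d in data_triples)


d = ("ex:bme", "ex:locatedIn", "ex:budapest")
r1 = ("ex:budapest", "ex:country", "ex:hungary")
r2 = ("ex:hungary", "ex:continent", "ex:europe")
r3 = ("ex:x", "rdfs:label", '"X"')
```

The triple `r3` is isolated, so a database containing it cannot be fully explored from `d`.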

Context Component. The last component of the model is responsible for tailoring the data to the context. Since the data provision has to take various parameters into account (location, time, etc.), this component can be seen as a context-aware recommender system based on [12]. The first step of the recommendation process is to specify the initial set of ratings, which can be done explicitly (given by the users) or implicitly (by means of inferences). After the initial data source has been created, the component tries to estimate the following recommendation function \( R \):

$$ R:User \times POI \times Context \to Rating, $$
(6)

where \( User \) is the set of users, \( POI \) is the data coming from component \( Data \), \( Context \) is the set of contextual information and \( Rating \) is the range of the ratings.

When function \( R \) has performed the estimation over \( User \times POI \times Context \) space, the context component returns the POI which has the highest \( Rating \) value. Intuitively, the component tries to estimate a value for unknown POIs. Function \( R \) can be defined in the following way.

Definition 16.

Let \( R:User \times POI \times Location \times Time \times Category \to Rating \) be a recommender function which estimates how user \( u \in User \) would rate \( p \in POI \) (deriving from the data component), which is located in \( l \in Location \) and has category \( c \in Category \), at time \( t \in Time \), namely \( R\left( {u,p,l,t,c} \right) \in Rating \).
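The following Python sketch illustrates how the context component could use such a function: unknown ratings are estimated and the POI with the highest estimate is returned. The estimator is deliberately naive and is an assumption of this sketch, not the method of [12]: an unknown rating is taken as the mean of the ratings the same user gave to POIs of the same category at the same time, with the global mean as a fallback. The arguments follow the order of the signature \( User \times POI \times Location \times Time \times Category \).

```python
from statistics import mean


def estimate_rating(known, user, poi, location, time, category):
    """Estimate R(u, p, l, t, c) from a dictionary of known ratings."""
    key = (user, poi, location, time, category)
    if key in known:
        return known[key]
    # Ratings of the same user in the same time slot and category
    similar = [
        rating
        for (u, _p, _l, t, c), rating in known.items()
        if u == user and t == time and c == category
    ]
    if similar:
        return mean(similar)
    return mean(known.values()) if known else 0.0


def recommend(known, user, time, candidates):
    """Return the (poi, location, category) candidate with the highest estimate."""
    return max(
        candidates,
        key=lambda c: estimate_rating(known, user, c[0], c[1], time, c[2]),
    )


known_ratings = {
    ("u1", "p1", "center", "evening", "restaurant"): 5,
    ("u1", "p2", "center", "evening", "museum"): 2,
}
best = recommend(
    known_ratings,
    "u1",
    "evening",
    [("p3", "center", "restaurant"), ("p4", "center", "museum")],
)
```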

Formal Description of the System Functionality. After the introduction of the components of the model, the abstract description of the features provided by the CAESAR model is given in this subsection. The definition of multivalued functions [13] is needed for the formal description of the features.

Definition 17.

A multivalued function \( f:A \to B^{*} \) is a function which assigns one or more values from the range to each element of the domain, namely \( \forall x \in A\;\exists n \in {\mathbb{N}},\;y_{1} , \cdots ,y_{n} \in B,\;\forall i \in \left[ {1,n} \right]:\left( {x,y_{i} } \right) \in f. \)

The signatures of the features of the CAESAR model can be defined by means of set theory functions and Definition 17. The core function of the model is \( browse:Data \times String \times {\mathbb{R}} \to {\mathcal{V}\mathcal{E}}^{ *} \), which takes the POI data derived from component \( Data \), a search keyword and a search radius (the current position of the user is considered the central point of the search area) and collects the virtual elements that represent the relevant hits. The result (e.g. a POI) can be visualized by component \( ARB \). In addition, detailed information about a given POI can be displayed using function \( details:{\mathcal{V}\mathcal{E}} \to I \times String^{n} \), where \( I \) is the set of images and the second component of the range is an \( n \)-dimensional word vector which represents the attributes of the virtual element. A new virtual element can also be created by component \( ARB \) using function \( new:{\mathbb{R}} \times {\mathbb{R}} \times I \times String^{n} \to {\mathcal{V}\mathcal{E}} \). The inputs of this feature are the latitude and longitude coordinates, an image, and a word vector which contains the properties of a POI. Furthermore, navigation is provided by component \( ARB \) by means of function \( navigate:{\mathcal{V}\mathcal{E}} \times {\mathbb{R}} \times {\mathbb{R}} \times {\mathbb{R}} \times {\mathbb{R}} \to ({\mathbb{R}} \times {\mathbb{R}})^{ *} \). Function \( navigate \) takes a virtual element (POI) with its latitude/longitude coordinates and the current latitude/longitude coordinates of the user, and returns the latitude/longitude coordinate pairs which represent the path from the current position to the given POI. Function \( toRDF:{\mathcal{V}\mathcal{E}} \times {\mathbb{R}}^{n} \to {\text{RDF}}_{3}^{ *} \) creates the RDF representation of a virtual element. In this way, a POI can be used for exploring a semantic dataset using Definition 15 and Eq. 5.
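A possible reading of function \( browse \) is sketched below in Python. Several details are assumptions of this sketch rather than part of the model: POIs are dictionaries with `name`, `lat` and `lon` keys, the keyword is matched as a case-insensitive substring of the name, the radius is given in meters, the user position is passed explicitly, and distances are computed with the haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def browse(data, keyword, radius_m, user_lat, user_lon):
    """Collect the POIs that match the keyword within the search radius."""
    hits = []
    for poi in data:
        if keyword.lower() not in poi["name"].lower():
            continue
        distance = haversine_m(user_lat, user_lon, poi["lat"], poi["lon"])
        if distance <= radius_m:
            hits.append(poi)
    return hits


pois = [
    {"name": "Technical University", "lat": 47.4814, "lon": 19.0556},
    {"name": "University of Debrecen", "lat": 47.5530, "lon": 21.6210},
    {"name": "Corner Cafe", "lat": 47.4810, "lon": 19.0550},
]
nearby_universities = browse(pois, "university", 1000.0, 47.48, 19.05)
```

The returned list plays the role of an element of \( {\mathcal{V}\mathcal{E}}^{ *} \): zero, one or several hits may be produced for the same input.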

3.2 Integrated Time-Space-Motion Logic

The logical description of our CAESAR system is based on Galton’s work [5], which introduces a special logic that makes it possible to combine temporal and spatial logics. In this way, the changes in context can be described and logical inferences can be executed.

Galton’s logic is based on the work of Randell et al. [6] who developed a spatial reasoning system (hereinafter referred to as RCC). RCC determines the set of relations between spatial regions and the concept of connection is considered as a primitive.

The concept of connection can be extended from regions to spatial points. In the case of points, connection can take two values: two points are either the same point or different points. The relation between a point \( p \) and a region \( r \) can be one of the following:

  • \( p \) is inside \( r \),

  • \( p \) bounds \( r \),

  • \( p \) is outside \( r \).

\( Inside \) can be considered as a primitive, then, \( Bounds \) and \( Outside \) can be defined in the following way:

$$ Bounds\left( {p,r} \right) \equiv \forall r^{\prime} \left( {Inside\left( {p,r^{\prime} } \right) \to PO\left( {r,r^{\prime} } \right)} \right) $$
(7)
$$ Outside\left( {p,r} \right) \equiv \neg Inside\left( {p,r} \right) \wedge \neg Bounds\left( {p,r} \right), $$
(8)

where \( PO \) means partially overlapping. The fulfillment of certain conditions can be indicated by three predicates, namely \( Holds - on \), \( Holds - in \), and \( Holds - at \). The first two predicates deal with states over intervals, while the last one relates states to instants. In addition, let \( inf(i) \) and \( sup(i) \) denote the boundaries of an interval \( i \) and let \( Div(t,i) \) denote that instant \( t \) falls within interval \( i \).

Events can be defined using the changes of states. Similarly to the aforementioned \( Holds \) predicates, three \( Occurs \) predicates can be introduced, namely \( Occurs - on \), \( Occurs - in \), and \( Occurs - at \). If an event \( e \) occurs during interval \( i \), then predicate \( Occurs - in(e,i) \) can be used. If event \( e \) takes the whole interval \( i \), \( Occurs - on(e,i) \) should be used. If event \( e \) is instantaneous and occurs at the instant \( t \), then \( Occurs - at(e,t) \) can be written [5].

Logical Formalization of the System. The first step of the logical description is to define the following predicates and function symbol:

  • \( Region(x) \) – \( x \) is a variable with type region,

  • \( VirtualRegion(x) \) – \( x \) is a variable with type virtual region,

  • \( Point(x) \) – \( x \) is a variable with type point,

  • \( VirtualPoint(x) \) – \( x \) is a variable with type virtual point,

  • \( ARDevice(x) \) – \( x \) is a variable with type AR device,

  • \( POI(x) \) – \( x \) is a variable with type POI,

  • \( Belongs(dev,vr) \) – indicates that virtual region \( vr \) belongs to AR device \( dev \),

  • \( Display(dev) \) – indicates, whether AR displaying is possible on AR device \( dev \),

  • \( Type(x) \) – function symbol, returns the category of variable \( x \) with type POI (e.g. university or restaurant).

The device which enables the AR displaying can be indicated by predicate \( ARDevice(x) \). Predicate \( VirtualRegion(x) \) represents a virtual region that belongs to variable \( y \) with type Region while \( VirtualPoint(x) \) represents a virtual element.

Hereinafter, the terms of virtual world and virtual region are used interchangeably. A variable with type virtual point can correspond to a POI variable; therefore, a virtual element can be assigned to a real world object. The following formalism can be used for creating a hierarchy among the types which is necessary for the further description:

$$ \forall x\left( {VirtualPoint\left( x \right) \to Point\left( x \right)} \right) $$
(9)
$$ \forall x\left( {POI\left( x \right) \to Point\left( x \right) \vee Region\left( x \right)} \right) $$
(10)
$$ \forall x\left( {ARDevice\left( x \right) \to Point\left( x \right) \vee Body\left( x \right)} \right) $$
(11)
$$ \forall x\left( {VirtualRegion\left( x \right) \to Region\left( x \right)} \right) $$
(12)

With the above predicates and statements, the logical description of a CAESAR system can be specified in more detail. The first building block is the assignment of a physical region to the corresponding virtual region. For this purpose, a formula is needed which decides whether a point belongs to a region, formally:

$$ \begin{aligned} Contains\left( {r, p} \right) \equiv & \;Region\left( r \right) \wedge Point\left( p \right) \\ & \wedge \left( {Bounds\left( {pos\left( p \right),r} \right) \vee Inside\left( {pos\left( p \right),r} \right)} \right) \\ \end{aligned} $$

where \( pos(p) \) is the position of point \( p \). The correspondence among the real and virtual points can be written using the above formula in the following way:

$$ \begin{aligned} Superimpose\left( {r,vr} \right) \equiv {} & \forall poi(POI\left( {poi} \right) \wedge Contains\left( {r,poi} \right) \\ & \to \exists vpoi(VirtualPoint\left( {vpoi} \right) \wedge Contains\left( {vr,vpoi} \right) \\ & \quad \wedge pos\left( {poi} \right) = pos\left( {vpoi} \right))) \\ \end{aligned} $$

Therefore, every physical POI belonging to region \( r \) can be assigned to a virtual object belonging to virtual region \( vr \) (which is corresponding to region \( r \)). The next step is the correspondence between the real and virtual world such that the virtual world corresponds to the given AR device. In this way, the physical region can correspond to the virtual region which can be seen on the display of the AR device:

$$ \begin{aligned} ARMode\left( {dev,r,vr} \right) \equiv {} & (ARDevice\left( {dev} \right) \wedge Region\left( r \right) \wedge VirtualRegion\left( {vr} \right) \\ & \wedge Belongs\left( {dev,vr} \right) \wedge pos\left( {vr} \right) = pos\left( r \right) \wedge Superimpose\left( {r,vr} \right)) \\ & \to Display\left( {dev} \right). \\ \end{aligned} $$

The formula checks whether virtual region \( vr \) belonging to AR device \( dev \) can correspond to the given region \( r \). In addition, it examines the correspondence of physical POIs with virtual elements. If both conditions hold, then AR device \( dev \) is able to superimpose AR content into region \( r \), i.e. the browser component \( ARB \) of CAESAR model can be used in the region \( r \).
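The predicates \( Contains \), \( Superimpose \) and the premise of \( ARMode \) can be evaluated on concrete data as in the Python sketch below. The geometry is an assumption made only for illustration: regions are axis-aligned rectangles, points are 2D coordinate pairs, \( pos(vr) = pos(r) \) is read as equality of the two rectangles, and \( Belongs(dev,vr) \) is modeled by a collection of virtual regions attached to the device. The type checks on \( dev \), \( r \) and \( vr \) are implicit in the data structures.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def inside(p, r):
    """The point lies in the open interior of the region."""
    return r.x_min < p[0] < r.x_max and r.y_min < p[1] < r.y_max


def bounds(p, r):
    """The point lies on the boundary of the region."""
    in_closed = r.x_min <= p[0] <= r.x_max and r.y_min <= p[1] <= r.y_max
    return in_closed and not inside(p, r)


def contains(r, p):
    return bounds(p, r) or inside(p, r)


def superimpose(r, vr, pois, virtual_points):
    """Every POI contained in r has a virtual point in vr at the same position."""
    return all(
        any(contains(vr, v) and v == p for v in virtual_points)
        for p in pois
        if contains(r, p)
    )


def display_possible(device_regions, r, vr, pois, virtual_points):
    """Premise of ARMode: Belongs(dev, vr), pos(vr) = pos(r), Superimpose(r, vr)."""
    return vr in device_regions and vr == r and superimpose(r, vr, pois, virtual_points)


region = Rect(0.0, 0.0, 10.0, 10.0)
pois = [(5.0, 5.0), (10.0, 3.0), (20.0, 20.0)]
virtual_points = [(5.0, 5.0), (10.0, 3.0)]
```

The POI at (20, 20) lies outside the region, so it does not need a virtual counterpart for the superimposition to hold.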

Certain questions related to the behavior of an AR system can be answered using logical inferences on data derived from the semantic \( SWB \) and the context \( C \) components of CAESAR model.

One such behavior-related question is whether virtual content is available while passing through a region. The logical formula that answers it is the following:

$$ \begin{aligned} Occurs - on & \left( {transitAR\left( {p,r,dev,vr} \right),i} \right) \\ & \equiv Holds - on\left( {pos\left( p \right) = pos\left( {dev} \right),i} \right) \\ & \wedge Holds - at\left( {Bounds\left( {pos\left( p \right),r} \right),{ \inf }\left( i \right)} \right) \\ & \wedge Holds - at\left( {Bounds\left( {pos\left( p \right),r} \right),{ \sup }\left( i \right)} \right) \\ & \wedge Holds - on\left( {Inside\left( {pos\left( p \right),r} \right) \wedge ARMode\left( {dev,r,vr} \right),i} \right), \\ \end{aligned} $$

where \( transitAR\left( {p,r,dev,vr} \right) \) describes the event which represents the motion of a user during the AR system usage. The position of AR device \( dev \) and the moving point \( p \) which represents the user is the same during the whole interval of passing through. Point \( p \) bounds one of the boundaries of region \( r \) in the first instant of passage, it crosses the opposite boundary in the last instant while it is located inside of region \( r \) in the meantime. Predicate \( ARMode \) contained by the last \( Holds - on \) predicate determines whether the displayable virtual content is available.

The above question is related to the virtual elements within a region; however, it is independent of time. Questions in which time is essential also need to be answered, and the introduced logical description can be used for them as well. Let us consider the following two questions:

  • Does the user miss a meeting at a given time and location?

  • Is the user allowed to enter a place at a given time?

The first question can be answered using the following logical formula:

$$ \begin{aligned} Occurs - at & \left( {contact\left( {b_{1} ,b_{2} ,p_{loc} ,t} \right),t_{cur} } \right) \\ & \equiv t_{cur} \le t \\ & \wedge Holds - at\left( {\left( {EC\left( {pos\left( {b_{1} } \right),pos\left( {b_{2} } \right)} \right) \wedge pos\left( {b_{1} } \right) = pos\left( {p_{loc} } \right)} \right),t_{cur} } \right) \\ & \wedge \exists t^{\prime}\left( {Holds - on\left( {DC\left( {pos\left( {b_{1} } \right),pos\left( {b_{2} } \right)} \right),\left( {t^{\prime},t_{cur} } \right)} \right)} \right). \\ \end{aligned} $$

The meeting of users \( b_{1} \) and \( b_{2} \) is represented by event \( contact(b_{1} ,b_{2} ,p_{loc} ,t) \). The meeting has been arranged for location \( p_{loc} \) at time \( t \). The formula is true if the current time \( t_{cur} \) is less than or equal to time \( t \); \( b_{1} \) and \( b_{2} \) are externally connected at location \( p_{loc} \) at time \( t_{cur} \) (i.e. the meeting has happened); and there is a time \( t^{\prime} \) such that \( b_{1} \) and \( b_{2} \) are disconnected throughout the interval \( (t^{\prime},t_{cur} ) \). The truth value of the following formula answers the second question:

$$ \begin{aligned} Occurs - & in\left( {enter\left( {b,r} \right),i} \right) \\ & \quad \quad \equiv POI(r) \wedge \exists t'\left( {t^{\prime} \ge \inf \left( i \right) \wedge Holds - at\left( {EC\left( {pos\left( b \right),r} \right),t^{\prime}} \right)} \right) \\ & \quad \quad \wedge \exists t^{\prime\prime}\left( {Div\left( {t^{\prime\prime},i} \right) \wedge t^{\prime\prime} > t' \wedge Holds - at\left( {EC\left( {pos\left( b \right),r} \right),t^{\prime\prime}} \right)} \right) \\ & \quad \quad \wedge \exists t\left( {Div\left( {t,\left( {t^{\prime},t^{\prime\prime}} \right)} \right) \wedge Holds - on\left( {PO\left( {pos\left( b \right),r} \right),\left( {t^{\prime},t^{\prime\prime}} \right)} \right)} \right). \\ \end{aligned} $$

A user \( b \) is permitted to pass through a region \( r \) if the entering time falls within the allowed time interval \( i \) and the exit time falls also within this interval. Since the type of region \( r \) is POI (see Eq. 10), the related opening/allowed time interval can be obtained from data component \( Data \) or semantic component \( SWB \) of CAESAR model.

The last example is the formalization of navigation. The navigation is regarded as successful if the user gets from point \( A \) to point \( B \) by passing through a path made up of regions. The formalization requires defining the passage of point \( p \) through region \( r \):

$$ \begin{aligned} Occurs - on & \left( {transit\left( {p,r} \right),i} \right) \\ & \equiv Holds - at\left( {Bounds\left( {pos\left( p \right),r} \right),\inf \left( i \right)} \right) \\ & \wedge Holds - at\left( {Bounds\left( {pos\left( p \right),r} \right),\sup \left( i \right)} \right) \\ & \wedge Holds - on\left( {Inside\left( {pos\left( p \right),r} \right),i} \right). \\ \end{aligned} $$
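The transit formula can be approximated on recorded position data, as the following Python sketch shows. The discretization is an assumption of this sketch and not part of Galton's logic: the trajectory is a list of (time, position) samples on a line, the region is a one-dimensional interval, \( Holds - at \) is checked at the first and last sample, and \( Holds - on \) is checked at every sample strictly in between. At least three samples are required so that the interior check is not vacuous.

```python
def classify(x, region):
    """Relation of a 1D position to a region given as (low, high)."""
    low, high = region
    if low < x < high:
        return "inside"
    if x == low or x == high:
        return "bounds"
    return "outside"


def occurs_on_transit(samples, region):
    """Sampled check of Occurs-on(transit(p, r), i) for (time, position) samples."""
    if len(samples) < 3:
        return False
    positions = [x for _t, x in sorted(samples)]
    starts_on_boundary = classify(positions[0], region) == "bounds"
    ends_on_boundary = classify(positions[-1], region) == "bounds"
    stays_inside = all(classify(x, region) == "inside" for x in positions[1:-1])
    return starts_on_boundary and ends_on_boundary and stays_inside


region = (0.0, 10.0)
crossing = [(0, 0.0), (1, 3.0), (2, 7.0), (3, 10.0)]
leaves_region = [(0, 0.0), (1, 12.0), (2, 10.0)]
starts_inside = [(0, 1.0), (1, 5.0), (2, 10.0)]
```

Because the formula only requires the point to be on the boundary at both ends of the interval, a trajectory that enters and leaves through the same boundary point also satisfies this sampled check.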

The logical statement of navigation can be given using the above formula:

$$ \begin{aligned} Occurs - on & \left( {navigate\left( {p_{user} ,r_{1} , \cdots ,r_{n} ,a,b} \right),i} \right) \\ & \equiv r_{1} = a \wedge r_{n} = b \wedge \forall r_{wp} \in \left[ {r_{1} , \cdots ,r_{n} } \right]: \\ & \quad (POI(r_{wp} ) \wedge Occurs - in(transit(p_{user} ,r_{wp} ),i) \\ & \quad \wedge \exists vpoi(VirtualPoint(vpoi) \wedge Inside(pos(vpoi),r_{wp} ))). \\ \end{aligned} $$

A point \( p_{user} \) which represents the user passes through each region \( r_{wp} \) (i.e. every waypoint of the navigation path). Since the type of the waypoints of navigation path is POI, virtual elements can be assigned to the waypoints and these points can be displayed by Augmented Reality, helping the user during the navigation. If time constraints are also available for each waypoint, then predicate \( Occurs - in\left( {enter\left( {p_{user} ,r_{wp} } \right),i} \right) \) can be used. In this way, the handling of time, space, and motion changes provided by the logic is fully utilized during the logical inference.

4 Use Cases

Two context-aware Android-based mobile AR browsers have been implemented based on our formal model. The architecture of the developed applications has been mapped to the proposed formal model, and the described modules have also been included. The first one is a tourist application that allows users to collect and display POIs in the surrounding environment [13]. The second is used in the field of cultural heritage; it provides movable story maps based on time and space [14]. Both applications use integrated data sources coming from component \( Data \), and this information is visualized by component \( ARB \). In the first application, publicly available semantic data sources can be explored starting from a given POI using component \( SWB \), while the second application provides POI-related semantic metadata browsing by means of component \( SWB \). Component \( C \) was restricted to location, time and the category of the POI. These applications can be seen as a proof of the practical applicability of our formal model. A detailed description of the browsers can be found in the cited papers.

5 Conclusion

In this paper, we presented CAESAR, a new formal model for context-aware semantic Augmented Reality systems. The components of the system were formalized using set theory functions, and its behavior was described using an integrated time-space-motion logic. Current Augmented Reality systems can be modeled with mathematical precision by means of the proposed model. The developed method contributes to the development of the theoretical foundations of Augmented Reality systems. To demonstrate the practical applicability of the CAESAR model, two context-aware mobile Augmented Reality applications based on our formal model were briefly presented. In the future, a more in-depth investigation into the implementation of the logical part of the model is needed.